TECHNICAL FIELD OF INVENTION

This invention relates to DNA sequences, recombinant DNA
molecules and processes for producing soluble T4 proteins.
More particularly, this invention relates to DNA sequences
that are characterized in that they code on expression in an
appropriate unicellular host for soluble forms of T4, the
receptor on the surface of T4+ lymphocytes, or derivatives
thereof. In accordance with this invention, the DNA
sequences, recombinant DNA molecules and processes of this
invention may be employed to produce soluble T4 essentially
free of other proteins of human origin. This soluble protein
may then advantageously be used in the immunotherapeutic,
prophylactic, and diagnostic compositions and methods of
this invention.

The soluble T4 protein-based immunotherapeutic compositions
and methods of this invention are useful in treating
immunodeficient patients suffering from diseases caused by
infective agents whose primary targets are T4+ lymphocytes.
According to a preferred embodiment, this invention relates
to soluble T4 protein-based compositions and methods which
are useful in preventing, treating or detecting acquired
immune deficiency syndrome, AIDS related complex and HIV
infection.

BACKGROUND ART 

The class of immune regulatory cells known as T cell
lymphocytes can be divided into two broad functional
classes, the first class comprising T helper or inducer
cells, which mediate T cell proliferation, lymphokine
release and helper cell interactions for Ig release, and the
second class comprising T cytotoxic or suppressor cells,
which participate in T cell-mediated killing and immune
response suppression. In general, these two classes of
lymphocytes are distinguished by expression of one of two
surface glycoproteins: T4 (m.w. 55,000-62,000 daltons) which
is expressed on T helper or inducer cells, probably as a
monomeric protein, or T8 (m.w. 32,000 daltons) which is
expressed on T cytotoxic or suppressor cells as a dimeric
protein.

The primary structures of T4 and T8 have been deduced from
their respective cDNA sequences [P. J. Maddon et al., "The
Isolation and Nucleotide Sequence Of A cDNA Encoding The T
Cell Surface Protein T4: A New Member Of The Immunoglobulin
Gene Family", Cell, 42, pp. 93-104 (1985); D. R. Littman
et al., "The Isolation And Sequence Of The Gene Encoding T8:
A Molecule Defining Functional Classes Of T Lymphocytes",
Cell, 40, pp. 237-46 (1985)]. Both predicted protein
sequences define molecules with domains expected for surface
antigens, including transmembrane and intracytoplasmic
domains at the carboxyl end of the protein. In addition,
both proteins contain an amino terminal region which shows
striking homology to immunoglobulin and T cell receptor
variable regions and which might function during target cell
recognition [Maddon et al., supra].

In immunocompetent individuals, T4 lymphocytes interact with
other specialized cell types of the immune system to confer
immunity to or defense against infection [E. L. Reinherz and
S. F. Schlossman, "The Differentiation And Function Of Human
T-Cells", Cell, 19, pp. 821-27 (1980)]. More specifically,
T4 lymphocytes stimulate production of growth factors which
are critical to a functional immune system. For example,
they act to stimulate B cells, the descendants of
hemopoietic stem cells, which promote the production of
defensive antibodies. They also activate macrophages
("killer cells") to attack infected or otherwise abnormal
host cells and they induce monocytes ("scavenger cells") to
encompass and destroy invading microbes.

It has been found that the primary target of or receptor for
certain infective agents is the T4 surface protein. These
agents include, for example, viruses and retroviruses. When
T4 lymphocytes are exposed to such agents, they are rendered
nonfunctional. As a result, the host's complex immune
defense system is destroyed and the host becomes susceptible
to a wide range of opportunistic infections.

Such immunosuppression is seen in patients suffering from
acquired immune deficiency syndrome ("AIDS"). AIDS is a
disease characterized by severe or, typically, complete
immunosuppression and attendant host susceptibility to a
wide range of opportunistic infections and malignancies. In
some cases, AIDS infection is accompanied by central nervous
system disorders. Complete clinical manifestation of AIDS is
usually preceded by AIDS related complex ("ARC"), a syndrome
accompanied by symptoms such as persistent generalized
lymphadenopathy, fever and weight loss. The human
immunodeficiency virus ("HIV") retrovirus is thought to be
the etiological agent responsible for AIDS infection and its
precursor, ARC [M. G. Sarngadharan et al., "Detection,
Isolation And Continuous Production Of Cytopathic
Retroviruses (HTLV-III) From Patients With AIDS And
Pre-AIDS", Science, 224, pp. 497-506 (1984)].*

Between 85 and 100% of the AIDS/ARCS population test
seropositive for HIV [G. M. Shaw et al., "Molecular
Characterization Of Human T-Cell Leukemia (Lymphotropic)
Virus Type III In The Acquired Immune Deficiency Syndrome",
Science, 226, pp. 1165-70 (1984)]. The number of adults in
the United States infected with HIV has been estimated to be
between 1 and 2.5 million [D. Barnes, "Strategies For An
AIDS Vaccine", Science, 233, pp. 1149-53 (1986); M. Rees,
"The Sombre View of AIDS", Nature, 326, pp. 343-45 (1987)].
These estimates include 64,900 individuals who do not belong
to an identified group at risk for AIDS [S. L. Sivak and
G. P. Wormser, "How Common Is HTLV-III Infection In The
United States?", New Eng. J. Med., 313, p. 1352 (1985)]. The
apparent annual rate of diagnosis for those infected with
HIV virus is between 1 and 2%, a rate which may increase
significantly in future years.

The genome of retroviruses, such as HIV, contains three
regions encoding structural proteins. The gag region encodes
the core proteins of the virion. The pol region encodes the
virion RNA-dependent DNA polymerase (reverse transcriptase).
The env region encodes the major glycoprotein found in the
membrane envelope of the virus and in the cytoplasmic
membrane of infected cells. The capacity of the virus to
attach to target cell receptors and to cause fusion of cell
membranes are two HIV properties controlled by the env gene.
These properties are believed to play a fundamental role in
the pathogenesis of the virus.

HIV env proteins arise from a precursor polypeptide that, in
mature form, is cleaved into a large heavily glycosylated
exterior membrane protein of about 481 amino acids (gp120)
and a smaller transmembrane protein of about 345 amino acids
which may be glycosylated (gp41) [L. Ratner et al.,
"Complete Nucleotide Sequence Of The AIDS Virus, HTLV-III",
Nature, 313, pp. 277-84 (1985)].

The host range of the HIV virus is associated with cells
which bear the surface glycoprotein T4. Such cells include
T4 lymphocytes and brain cells [P. J. Maddon et al., "The T4
Gene Encodes The AIDS Virus Receptor And Is Expressed In The
Immune System And The Brain", Cell, 47, pp. 333-48 (1986)].
Upon infection of a host by HIV virus, the T4 lymphocytes
are rendered non-functional. The progression of AIDS/ARCS
syndromes can be correlated with the depletion of T4+
lymphocytes, which display the T4 surface glycoprotein. This
T cell depletion, with ensuing immunological compromise, may
be attributable to both recurrent cycles of infection and
lytic growth from cell-mediated spread of the virus. In
addition, clinical observations suggest that the HIV virus
is directly responsible for the central nervous system
disorders seen in many AIDS patients.

The tropism of the HIV virus for T4+ cells is believed to be
attributed to the role of the T4 cell surface glycoprotein
as the membrane-anchored virus receptor. Because T4 behaves
as the HIV virus receptor, its extracellular sequence
probably plays a direct role in binding HIV. More
specifically, it is believed that HIV envelope selectively
binds to the T4 epitope(s), using this interaction to
initiate entry into the host cell [A. G. Dalgleish et al.,
"The CD4 (T4) Antigen Is An Essential Component Of The
Receptor For The AIDS Retrovirus", Nature, 312, pp. 763-67
(1984); D. Klatzmann et al., "T-Lymphocyte T4 Molecule
Behaves As The Receptor For Human Retrovirus LAV", Nature,
312, pp. 767-68 (1984)]. Accordingly, cellular expression of
T4 is believed to be sufficient for HIV binding, with the T4
protein serving as a receptor for the HIV virus.

The T4 tropism of the HIV virus has been demonstrated in
vitro. When HIV virus isolated from AIDS patients is
cultured together with T helper lymphocytes preselected for
surface T4, the lymphocytes are efficiently infected,
display cytopathic effects, including multinuclear syncytia
formation, and are killed by lytic growth [D. Klatzmann
et al., "Selective Tropism Of Lymphadenopathy Associated
Virus (LAV) For Helper-Inducer T Lymphocytes", Science, 225,
pp. 59-63 (1984); F. Wong-Staal and R. C. Gallo, "Human
T-Lymphotropic Retroviruses", Nature, 317, pp. 395-403
(1985)]. It has been demonstrated that a cloned cDNA version
of human T4, when expressed on the surface of transfected
cells from non-T cell lineages, including murine and
fibroblastoid cells, endows those cells with the ability to
bind HIV [P. J. Maddon et al., "The T4 Gene Encodes The AIDS
Virus Receptor And Is Expressed In The Immune System And The
Brain", Cell, 47, pp. 333-48 (1986)].

During the course of HIV infection, the host mounts both a
humoral and a cellular immune response to the virus. These
responses include the appearance of antibodies which bind to
a number of viral products and which exhibit neutralizing
effect or antibody dependent cellular cytotoxic function
[M. Robert-Guroff et al., "HTLV-III-Neutralizing Antibodies
In Patients With AIDS And AIDS-Related Complex", Nature,
316, pp. 72-74 (1985); F. Barin et al., "Virus Envelope
Protein Of HTLV-III Represents Major Target Antigen For
Antibodies In AIDS Patients", Science, 228, pp. 1094-96
(1985); A. H. Rook et al., "Sera From HTLV-III/LAV Antibody
Positive Individuals Mediate Antibody Dependent Cellular
Cytotoxicity Against HTLV-III/LAV Infected T Cells",
J. Immunol., 138, pp. 1064-68 (1987)]. Epitopes of the HIV
envelope have been identified as important determinants in
eliciting a neutralizing antibody response. And,
determinants in antibody dependent cellular cytotoxicity
("ADCC") activity include HIV env and, possibly, gag
epitopes.

In the absence to date of effective treatments for AIDS,
many efforts have centered on prevention of the disease.
Such preventative measures include HIV antibody screening
for all blood, organ and semen donors and education of AIDS
high-risk groups regarding transmission of the disease.

Experimental or early-stage clinical treatments of AIDS and
ARCS conditions have included the administration of
antiviral drugs, such as HPA-23, phosphonoformate, suramin,
ribavirin, azidothymidine ("AZT") and dideoxycytidine, which
apparently interfere with replication of the virus through
reverse transcriptase inhibition. Although each of these
drugs exhibits activity against HIV in vitro, only AZT has
demonstrated potential benefits in clinical trials. AZT
administration in effective amounts, however, has been
accompanied by undesirable and debilitating side effects,
such as bone marrow depression. It is likely, therefore,
that hematologic toxicity will be a major rate limiting
factor in the long term use of AZT.

Other proposed methods for treating AIDS have focused on the
development of agents having activity against steps in the
viral replicative cycle other than reverse transcription.
Such methods include the administration of interferons or
the application of hybridoma technology. Most of these
treatment strategies are expected to require the
co-administration of immunomodulators, such as
interleukin-2.

To date, the need exists for the development of effective
immunotherapeutic agents and methods for the treatment of
AIDS, ARCS, HIV infection and other immunodeficiencies
caused by T lymphocyte depletion or abnormalities.

DISCLOSURE OF THE INVENTION

The present invention solves the problems 
referred to above by providing, in large amounts, 
soluble T4 and soluble derivatives thereof that act 
as receptors for infective agents whose primary target 
is the T4 surface protein of T4+ lymphocytes. Advan- 
tageously, this invention also provides soluble T4 
essentially free of other proteins of human origin 
and in a form that is not contaminated by viruses, 
such as HIV or hepatitis B virus. 

As will be appreciated from the disclosure 
to follow, the DNA sequences and recombinant DNA 
molecules of this invention are capable of directing, 
in an appropriate host, the production of soluble T4 
or derivatives thereof. The polypeptides of this 
invention are useful, either as produced in the host 
or after further derivatization or modification, in 
a variety of immunotherapeutic compositions and 
methods for treating immunodeficient patients 
suffering from diseases caused by infective agents 
whose primary targets are T4+ lymphocytes. According to
various embodiments of this invention, such compositions and
methods relate to a soluble receptor for HIV, soluble T4
proteins and polypeptides and antibodies thereto. The
soluble T4 proteins and polypeptides of this invention
include monovalent, as well as polyvalent forms.

The compositions and methods of this invention, which are
based upon soluble T4 proteins, polypeptides or peptides and
antibodies thereto, are particularly useful for the
prevention, treatment or detection of the HIV-related
infections AIDS and ARC. More specifically, the soluble
T4-based compositions and methods of this invention employ
soluble T4-like polypeptides, polypeptides which
advantageously interfere with the T4/HIV interaction by
blocking or competitive binding mechanisms which inhibit HIV
infection of cells expressing the T4 surface protein. These
soluble T4-like polypeptides inhibit adhesion between T4+
lymphocytes and infective agents which target T4+
lymphocytes and inhibit interaction between T4+ lymphocytes
and antigen presenting cells and targets of T4+
lymphocyte-mediated killing. By acting as soluble virus
receptors, the compositions of this invention may be used as
antiviral therapeutics to inhibit HIV binding to T4+ cells
and virally induced syncytium formation at the level of
receptor binding.

This invention accomplishes these goals by providing DNA
sequences coding on expression in an appropriate unicellular
host for soluble T4 proteins* and soluble derivatives
thereof.

* As used in this application, "soluble T4 protein",
"soluble T4" and "soluble T4-like polypeptides" include all
proteins, polypeptides and peptides which are natural or
recombinant soluble T4 proteins, or soluble derivatives
thereof, and which are characterized by the
immunotherapeutic (anti-retroviral)

(footnote continued on following page)

This invention also provides recombinant DNA molecules
containing those DNA sequences and unicellular hosts
transformed with them. Those hosts permit the production of
large quantities of the novel soluble T4 proteins,
polypeptides, peptides and derivatives of this invention for
use in a wide variety of therapeutic, prophylactic and
diagnostic compositions and methods.

The DNA sequences of this invention are 
selected from the group consisting of: 

(a) the DNA inserts of p199-7, pBG377, pBG380, pBG381,
p203-5, pBG391, pBG392, pBG393, pBG394, pBG395, pBG396,
pBG397, p211-11, p214-10 and p215-7;

(b) DNA sequences which hybridize to one or more of the
foregoing DNA inserts and which code on expression for a
soluble T4-like polypeptide; and

(c) DNA sequences which code on expression for a soluble
T4-like polypeptide coded for on expression by any of the
foregoing DNA inserts and sequences.

According to an alternate embodiment, this invention also
relates to a DNA sequence comprising the DNA insert of
p170-2, said sequence coding on expression for a T4-like
polypeptide. And, this invention also relates to recombinant
DNA molecules and processes for producing T4 protein using
that DNA sequence.



(footnote continued from preceding page)

Figure 1 is an autoradiograph depicting the purification of
T4 protein from U937 cells by immunoaffinity chromatography.

Figure 2 depicts autoradiograph and western blot data
demonstrating that immunoaffinity-purified, solubilized
native T4 protein binds to HIV envelope protein.

Figure 3 depicts the nucleotide sequence and the derived
amino acid sequence of T4 cDNA obtained from PBL clone
λ203-4. In this figure, the amino acids are represented by
single letter codes as follows:

Phe: F    Leu: L    Ile: I    Met: M
Val: V    Ser: S    Pro: P    Thr: T
Ala: A    Tyr: Y    His: H    Gln: Q
Asn: N    Lys: K    Asp: D    Glu: E
Cys: C    Trp: W    Arg: R    Gly: G

= position at which a stop codon is present.

In Figure 3, the T4 protein translation start (AA-23) is
located at the methionine at nucleotides 201-203 and the
mature N-terminus is located at the lysine (AA3) at
nucleotides 276-278.
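
The two positions quoted for Figure 3 are consistent with a numbering in which the leader residues run from AA-23 to AA-1, the following residues run from AA1 upward, and there is no residue numbered zero. The Python sketch below checks that arithmetic. It is illustrative only: the function name is hypothetical, and the assumption that the codons are contiguous from the start codon is inferred from the two quoted positions rather than stated in the text.

```python
def codon_position(aa_number, start_nt=201, leader_length=23):
    """Return (first, last) nucleotide of the codon for a residue number.

    Assumes residues are numbered -leader_length..-1 and then 1, 2, 3, ...
    (no residue 0), with contiguous codons beginning at start_nt.
    start_nt=201 is the Figure 3 translation start quoted in the text.
    """
    if aa_number == 0 or aa_number < -leader_length:
        raise ValueError("residue number outside the assumed numbering")
    if aa_number < 0:
        index = aa_number + leader_length        # AA-23 -> codon index 0
    else:
        index = aa_number + leader_length - 1    # AA1 -> codon index 23
    first = start_nt + 3 * index
    return first, first + 2


if __name__ == "__main__":
    print(codon_position(-23))  # translation start codon
    print(codon_position(3))    # lysine at the mature N-terminus
```

Under these assumptions the function reproduces both quoted positions: nucleotides 201-203 for AA-23 and nucleotides 276-278 for AA3.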

Figure 4 is a schematic outline of the construction of cDNA
clones pBG312.T4 (also called p171-1) and p170-2.

Figure 5 is a schematic outline of the construction of
plasmid pEC100.

Figure 6 depicts amino acid comparisons at positions 3, 64
and 231 of various T4 cDNA clones.

Figures 7A and 7B depict the protein domain structure of
purified, solubilized T4 protein and recombinant soluble T4
mutants.

Figures 8A-8D are schematic outlines of constructions of
various intermediate plasmids and other plasmids used to
express recombinant soluble T4 ("rsT4") of this invention.



Figure 9A is a schematic outline of the construction of
plasmid p199-7.

Figures 9B and 9C are schematic outlines of the construction
of plasmid p203-5.

Figure 10 depicts the synthetic oligonucleotide linkers
employed in various constructions according to this
invention.

Figure 11 depicts the nucleotide sequence of the entire
plasmid defined by p199-7 (P,mutet.rsT4) and its rsT4.2
insert and the amino acid sequence deduced from the rsT4
sequence. This includes the ClaI-ClaI cassette which defines
the Met perfect rsT4.2 coding sequence.

Figure 12 depicts a protein blot analysis of an induction of
rsT4.2 expression from SG936/p199-7.

Figure 13 is a schematic outline of the construction of
plasmid pBG368.

Figures 14A-14C are schematic outlines of 
constructions of various plasmids of this invention. 

Figure 15 depicts the nucleotide sequence 
of plasmid pBG391. 

Figure 16 depicts the nucleotide sequence of plasmid pBG392.
In this figure, the T4 protein translation start (AA-23) is
located at the methionine at nucleotides 1207-1209 and the
mature N-terminus is located at the lysine (AA3) at
nucleotides 1281-84.

Figure 17 is a schematic outline of con- 
structions of various plasmids of this invention. 

Figure 18 depicts the synthetic oligonucleo- 
tide linkers employed in various constructions 
according to this invention. 

Figure 19 depicts the nucleotide sequence 
of plasmid pBG394. 

Figure 20 depicts the nucleotide sequence of plasmid pBG396.

Figure 21 depicts the nucleotide sequence of plasmid pBG393.

Figure 22 depicts the nucleotide sequence of plasmid pBG395.

Figure 23 is a Coomassie stained gel of rsT4.2 purified from
the conditioned medium of the pBG380 transfected CHO cell
line BG380G.

Figure 24 is a schematic outline of the construction of
plasmid p196-10.

Figure 25 is a schematic outline of the construction of
plasmid pBG394.

Figure 26 is a schematic outline of the construction of
plasmid p211-11.

Figure 27 is a schematic outline of the construction of
plasmid p215-7.

Figure 28 is a schematic outline of the construction of
plasmid p218-8.

Figure 29A is a Coomassie stained gel of rsT4.113.1 purified
from the conditioned medium of pBG211-11 transfected
E. coli.

Figure 29B is an autoradiograph depicting a western blot
analysis of rsT4.113.1 expressed in E. coli.

Figure 30, panels (a)-(c) depict the purification of
rsT4.113.1 from E. coli transformants.

Figure 31, panels (a)-(c) depict the refolding of purified
rsT4.113.1.

Figure 32 is an autoradiograph depicting the
immunoprecipitation of 35S-metabolically labelled CHO cell
lines producing recombinant soluble T4.

Figure 33 depicts an immunoblot analysis of COS 7 cell lines
producing recombinant soluble T4.

Figure 34 depicts in graphic form the results of a
competition assay between rsT4.113.1 and rsT4.3 for binding
to OKT4A or OKT4.

Figures 35-37 depict in graphic form the results of
competition assays between rsT4.111 and rsT4.3 for binding
to, respectively, OKT4A, Leu-3a and OKT4.

Figure 38 depicts in graphic form an ELISA assay for
rsT4.113.1 from E. coli transformants.

Figure 39 depicts in graphic form the results of a p24
radioimmunoassay using recombinant soluble T4 according to
this invention.

Figures 40 and 41 depict the results of syncytia inhibition
assays using recombinant soluble T4 proteins according to
this invention.

Figure 42 is a schematic outline of the construction of
plasmid pBiv.1.

Figure 43 depicts the bivalent recombinant soluble T4
protein produced by pBiv.1.

DETAILED DESCRIPTION OF THE INVENTION

We isolated the DNA sequences of this invention from two
libraries: a λgt cDNA library derived from the T cell tumor
line REX and a λgt10 cDNA library derived from peripheral
blood lymphocytes. However, we could also have employed
libraries prepared from other cells that express T4. These
include, for example, H9 and U937. We also used a human
genomic bank to isolate various fragments of the T4 gene.

For screening these libraries, we used a series of
chemically synthesized anti-sense oligonucleotide DNA probes
based upon the T4 protein sequence set forth in Maddon
et al. (1985), supra.

For screening, we hybridized our oligonucleotide probes to
our cDNA libraries utilizing a plaque hybridization
screening assay. We selected clones hybridizing to several
of our probes. And, after isolating and subcloning the cDNA
inserts of the selected clones into plasmids, we determined
their nucleotide sequences and compared the amino acid
sequences deduced from these nucleotide sequences to the
amino acid sequences referred to in Maddon et al. (1985),
supra. As a result of these comparisons, we determined that
all of our selected clones were characterized by cDNA
inserts coding for amino acid sequences of human T4.

We have depicted in Figure 3 the nucleotide sequence of
full-length T4 cDNA obtained from deposited clone p170-2 and
the amino acid sequence deduced therefrom. That cDNA
sequence was subsequently subjected to in vitro
site-directed mutagenesis and restriction fragment
substitution so that its cDNA sequence was identical to that
of Maddon et al.

After modifying our T4 cDNA sequence to be identical to that
of Maddon et al., we truncated samples of it in various
positions to remove the coding regions for the transmembrane
and intracytoplasmic domains. The remaining cDNA sequences
encoded a soluble T4 which retained the extracellular region
believed to be responsible for HIV binding.
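
The truncation described above can be pictured as keeping the codons up to a chosen residue and ending the reading frame there. The Python sketch below is a conceptual illustration only, not the cloning procedure used in this application: the function name, the choice of TAA as the stop codon and the toy sequence are all hypothetical and are not taken from the T4 cDNA.

```python
STOP_CODON = "TAA"  # assumed stop codon, chosen for illustration only


def truncate_coding_sequence(cds, last_codon):
    """Keep codons 1..last_codon of a coding sequence and append a stop codon.

    cds is a string of nucleotides whose length is a multiple of three and
    which does not already include a stop codon.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    if not 1 <= last_codon <= n_codons:
        raise ValueError("last_codon is outside the coding sequence")
    return cds[: 3 * last_codon] + STOP_CODON


if __name__ == "__main__":
    toy_cds = "ATGAAAGTTCTGTGG"  # hypothetical 5-codon sequence, not T4
    print(truncate_coding_sequence(toy_cds, 3))
```

In the same spirit, removing the coding regions for the transmembrane and intracytoplasmic domains corresponds to choosing a last retained codon that lies within the extracellular region.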

We then constructed various clones characterized by such
cDNA inserts coding for human soluble T4. Those cDNA
sequences may be used in a variety of ways in accordance
with this invention. More particularly, those sequences or
portions of them, or synthetic or semi-synthetic copies of
them, may be used as DNA probes to screen other human or
animal cDNA or genomic libraries to select by hybridization
other DNA sequences that are related to soluble T4.
Typically, conventional hybridization conditions, e.g.,
about 20° to 27°C below Tm, are employed in such selections.
However, less stringent conditions may be necessary when the
library is being screened with a probe from a different
species than that from which the library is derived, e.g.,
the screening of a mouse library with a human probe.
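
As a worked illustration of the "about 20° to 27°C below Tm" guideline, the Python sketch below converts a melting temperature into the corresponding hybridization temperature window. The function name is hypothetical and the Tm used in the example is an arbitrary placeholder, not a value from this application; how Tm itself is determined for a given probe is outside the scope of the sketch.

```python
def hybridization_window(tm_celsius, low_offset=20.0, high_offset=27.0):
    """Return (coldest, warmest) hybridization temperature in degrees C.

    The window spans high_offset to low_offset degrees below the melting
    temperature, following the 20-27 degree guideline quoted in the text.
    """
    if low_offset > high_offset:
        raise ValueError("low_offset must not exceed high_offset")
    return tm_celsius - high_offset, tm_celsius - low_offset


if __name__ == "__main__":
    # 65.0 is a placeholder Tm chosen only to show the arithmetic.
    coldest, warmest = hybridization_window(65.0)
    print(f"hybridize between {coldest} and {warmest} degrees C")
```

A less stringent screen, such as the cross-species case mentioned above, would correspond to hybridizing at a temperature further below Tm than this window.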

Such cDNA inserts, portions of them, or synthetic or
semi-synthetic copies of them, may also be used as starting
materials to prepare various mutations. Such mutations may
be either degenerate, i.e., the mutation does not change the
amino acid sequence encoded by the mutated codon, or
non-degenerate, i.e., the mutation changes the amino acid
sequence encoded by the mutated codon. Both types of
mutations may be advantageous in producing or using soluble
T4s according to this invention. For example, these
mutations may permit higher levels of production or easier
purification of soluble T4 or higher T4 activity.
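
The distinction between degenerate and non-degenerate mutations can be made concrete by translating the codon before and after the change. The Python sketch below does this with the standard genetic code; the function name and the example codons are hypothetical and are offered only to illustrate the definitions given above.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_mutation(original_codon, mutated_codon):
    """Return 'degenerate' if both codons encode the same amino acid,
    otherwise 'non-degenerate'."""
    before = CODON_TABLE[original_codon.upper()]
    after = CODON_TABLE[mutated_codon.upper()]
    return "degenerate" if before == after else "non-degenerate"


if __name__ == "__main__":
    print(classify_mutation("AAA", "AAG"))  # both codons encode lysine
    print(classify_mutation("AAA", "AGA"))  # lysine changed to arginine
```

A degenerate change of this kind alters only the DNA, whereas a non-degenerate change also alters the encoded polypeptide.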

For all of these reasons, the DNA sequences of this
invention are selected from the group consisting of:

(a) the DNA inserts of p199-7, pBG377, pBG380, pBG381,
p203-5, pBG391, pBG392, pBG393, pBG394, pBG395, pBG396,
pBG397, p211-11, p214-10 and p215-7;

(b) DNA sequences which hybridize to one or more of the
foregoing DNA inserts and which code on expression for a
soluble T4-like polypeptide; and

(c) DNA sequences which code on expression for a soluble
T4-like polypeptide coded for on expression by any of the
foregoing DNA inserts and sequences.

Preferably, the DNA sequences of this invention code for a
polypeptide selected from the group consisting of a
polypeptide of the formula AA-23-AA362 of Figure 3, a
polypeptide of the formula AA1-362 of Figure 3, a
polypeptide of the formula Met-AA1-362 of Figure 3, a
polypeptide of the formula AA1-374 of Figure 3, a
polypeptide of the formula Met-AA1-374 of Figure 3, a
polypeptide of the formula AA1-377 of Figure 3, a
polypeptide of the formula Met-AA1-377 of Figure 3, a
polypeptide of the formula AA-23-AA374 of Figure 3, a
polypeptide of the formula AA-23-AA377 of Figure 3, or
portions thereof.

DNA sequences according to this invention also preferably
code for a polypeptide selected from the group consisting of
a polypeptide of the formula AA-23-AA182 of Figure 16, a
polypeptide of the formula AA1-AA182 of Figure 16, a
polypeptide of the formula Met-AA1-182 of Figure 16, a
polypeptide of the formula AA-23-AA182 of Figure 16,
followed by the amino acids
asparagine-leucine-glutamine-histidine-serine-leucine, a
polypeptide of the formula AA1-AA182 of Figure 16, followed
by the amino acids
asparagine-leucine-glutamine-histidine-serine-leucine, a
polypeptide of the formula Met-AA1-182 of Figure 16,
followed by the amino acids
asparagine-leucine-glutamine-histidine-serine-leucine, a
polypeptide of the formula AA-23-AA113 of Figure 16, a
polypeptide of the formula AA1-AA113 of Figure 16, a
polypeptide of the formula Met-AA1-113 of Figure 16, a
polypeptide of the formula AA-23-AA111 of Figure 16, a
polypeptide of the formula AA1-AA111 of Figure 16, a
polypeptide of the formula Met-AA1-111 of Figure 16, a
polypeptide of the formula AA-23-AA131 of Figure 16, a
polypeptide of the formula AA1-AA131 of Figure 16, a
polypeptide of the formula Met-AA1-131 of Figure 16, a
polypeptide of the formula AA-23-AA145 of Figure 16, a
polypeptide of the formula AA1-AA145 of Figure 16, a
polypeptide of the formula Met-AA1-145 of Figure 16, a
polypeptide of the formula AA-23-AA166 of Figure 16, a
polypeptide of the formula AA1-AA166 of Figure 16, a
polypeptide of the formula Met-AA1-166 of Figure 16, or
portions thereof.

Additionally, DNA sequences of this invention code for a
polypeptide selected from the group consisting of a
polypeptide of the formula AA-23-AA362 of mature T4 protein,
a polypeptide of the formula AA1-362 of mature T4 protein, a
polypeptide of the formula Met-AA1-362 of mature T4 protein,
a polypeptide of the formula AA1-374 of mature T4 protein, a
polypeptide of the formula Met-AA1-374 of mature T4 protein,
a polypeptide of the formula AA1-377 of mature T4 protein, a
polypeptide of the formula Met-AA1-377 of mature T4 protein,
a polypeptide of the formula AA-23-AA374 of mature T4
protein, a polypeptide of the formula AA-23-AA377 of mature
T4 protein, or portions thereof.

DNA sequences according to this invention also code for a
polypeptide selected from the group consisting of a
polypeptide of the formula AA-23-AA182 of mature T4 protein,
a polypeptide of the formula AA1-AA182 of mature T4 protein,
a polypeptide of the formula Met-AA1-182 of mature T4
protein, a polypeptide of the formula AA-23-AA182 of mature
T4 protein, followed by the amino acids
asparagine-leucine-glutamine-histidine-serine-leucine, a
polypeptide of the formula AA1-AA182 of mature T4 protein,
followed by the amino acids
asparagine-leucine-glutamine-histidine-serine-leucine, a
polypeptide of the formula Met-AA1-182 of mature T4 protein,
followed by the amino acids
asparagine-leucine-glutamine-histidine-serine-leucine, a
polypeptide of the formula AA-23-AA113 of mature T4 protein,
a polypeptide of the formula AA1-AA113 of mature T4 protein,
a polypeptide of the formula Met-AA1-113 of mature T4
protein, a polypeptide of the formula AA-23-AA111 of mature
T4 protein, a polypeptide of the formula AA1-AA111 of mature
T4 protein, a polypeptide of the formula Met-AA1-111 of
mature T4 protein, a polypeptide of the formula AA-23-AA131
of mature T4 protein, a polypeptide of the formula AA1-AA131
of mature T4 protein, a polypeptide of the formula
Met-AA1-131 of mature T4 protein, a polypeptide of the
formula AA-23-AA145 of mature T4 protein, a polypeptide of
the formula AA1-AA145 of mature T4 protein, a polypeptide of
the formula Met-AA1-145 of mature T4 protein, a polypeptide
of the formula AA-23-AA166 of mature T4 protein, a
polypeptide of the formula AA1-AA166 of mature T4 protein, a
polypeptide of the formula Met-AA1-166 of mature T4 protein,
or portions thereof.

The amino terminal amino acid of mature
T4 protein isolated from T cells begins at lysine,
the third amino acid of the sequence depicted in
Figure 16. Accordingly, soluble T4 proteins also
include polypeptides of the formula AA₃-AA₃₇₇ of
Figure 16, or portions thereof. Such polypeptides
include polypeptides selected from the group consisting
of a polypeptide of the formula AA₃ to AA₃₆₂ of
Figure 16, a polypeptide of the formula AA₃ to AA₃₇₄
of Figure 16, a polypeptide of the formula AA₃-AA₁₈₂
of Figure 16, a polypeptide of the formula AA₃-AA₁₁₃
of Figure 16, a polypeptide of the formula AA₃-AA₁₃₁
of Figure 16, a polypeptide of the formula AA₃-AA₁₄₅
of Figure 16, a polypeptide of the formula AA₃-AA₁₆₆
of Figure 16, and a polypeptide of the formula
AA₃-AA of Figure 16. Soluble T4 proteins also
include the above-recited polypeptides preceded by
an N-terminal methionine group.

Soluble T4 protein constructs according to
this invention may also be produced by truncating
the full length T4 protein sequence at various positions
to remove the coding regions for the transmembrane
and intracytoplasmic domains, while retaining
the extracellular region believed to be responsible
for HIV binding. More particularly, soluble T4
polypeptides may be produced by conventional techniques
of oligonucleotide directed mutagenesis;
restriction digestion, followed by insertion of
linkers; or chewing back full length T4 protein with
enzymes.

Alternatively, soluble T4 polypeptides may
be chemically synthesized by conventional peptide
synthesis techniques, such as solid phase synthesis
[R. B. Merrifield, "Solid Phase Peptide Synthesis.
I. The Synthesis Of A Tetrapeptide", J. Am. Chem.
Soc., 85, pp. 2149-54 (1963)].

The DNA sequences of this invention code 
for soluble proteins and derivatives that are believed 
to bind to Major Histocompatibility Complex antigens 
and envelope glycoprotein of certain retroviruses, 
such as HIV. Preferably, they also inhibit syncytium 
formation, believed to be the mode of intracellular 
HIV virus spread. And, they may inhibit interaction
between T4+ lymphocytes and antigen-presenting cells
and targets of T4+ cell mediated killing. Most
preferably, they also inhibit adhesion between T4+
lymphocytes and infective agents, such as the HIV
virus, whose primary targets are T4+ lymphocytes.

The DNA sequences of this invention are 
also useful for producing soluble T4 or its deriva- 
tives coded for on expression by them in unicellular 
hosts transformed with those DNA sequences. As well 
known in the art, for expression of the DNA sequences
of this invention, the DNA sequence should be opera- 
tively linked to an expression control sequence in 
an appropriate expression vector and employed in 
that expression vector to transform an appropriate 
unicellular host. 

Such operative linking of a DNA sequence 
of this invention to an expression control sequence, 
of course, includes the provision of a translation 
start signal in the correct reading frame upstream 
of the DNA sequence. If the particular DNA sequence
of this invention being expressed does not begin
with a methionine, the start signal will result in
an additional amino acid, methionine, being
located at the N-terminus of the product. While
such methionyl-containing product may be employed
directly in the compositions and methods of this
invention, it is usually more desirable to remove
the methionine before use. Methods are available in
the art to remove such N-terminal methionines from
polypeptides expressed with them. For example,
certain hosts and fermentation conditions permit 
removal of substantially all of the N-terminal 
methionine in vivo. Other hosts require in vitro 
removal of the N-terminal methionine. However, such 
in vivo and in vitro methods are well known in the 
art. 
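
The effect of the start signal on the expressed product can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `add_start_codon` and `gains_n_terminal_methionine` and the example coding fragment are hypothetical and are not taken from this specification.

```python
def add_start_codon(coding_seq: str) -> str:
    """Return the coding sequence with an in-frame ATG at its 5' end.

    Hypothetical helper: if the sequence already begins with a methionine
    codon it is returned unchanged, otherwise ATG is prepended.
    """
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if seq.startswith("ATG"):
        return seq
    return "ATG" + seq


def gains_n_terminal_methionine(coding_seq: str) -> bool:
    """True when the start signal adds an extra N-terminal methionine."""
    return not coding_seq.upper().startswith("ATG")


# Hypothetical fragment whose first codon (AAG) encodes lysine.
example = "AAGAAAGTGGTG"
print(add_start_codon(example))              # ATGAAGAAAGTGGTG
print(gains_n_terminal_methionine(example))  # True
```

Prepending a whole codon keeps the downstream sequence in the same reading frame, which is the condition the text places on the translation start signal.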



A wide variety of host/expression vector 
combinations may be employed in expressing the DNA 
sequences of this invention. Useful expression
vectors, for example, may consist of segments of
chromosomal, non-chromosomal and synthetic DNA
sequences, such as various known derivatives of SV40
and known bacterial plasmids, e.g., plasmids from
E. coli including col E1, pCR1, pBR322, pMB9 and their
derivatives, wider host range plasmids, e.g., RP4,
phage DNAs, e.g., the numerous derivatives of phage λ,
e.g., NM989, and other DNA phages, e.g., M13 and
filamentous single stranded DNA phages, yeast plasmids,
such as the 2μ plasmid or derivatives thereof,
and vectors derived from combinations of plasmids
and phage DNAs, such as plasmids which have been
modified to employ phage DNA or other expression
control sequences. For animal cell expression, we
prefer to use plasmid pBG368, a derivative of pBG312
[R. Cate et al., "Isolation Of The Bovine And Human
Genes For Mullerian Inhibiting Substance And
Expression Of The Human Gene In Animal Cells", Cell,
45, pp. 685-98 (1986)] which contains the major late
promoter of adenovirus 2.

In addition, any of a wide variety of
expression control sequences, that is, sequences that
control the expression of a DNA sequence when
operatively linked to it, may be used in these vectors
to express the DNA sequence of this invention. Such
useful expression control sequences include, for
example, the early and late promoters of SV40 or the
adenovirus, the lac system, the trp system, the TAC
or TRC system, the major operator and promoter regions
of phage λ, the control regions of fd coat protein,
the promoter for 3-phosphoglycerate kinase or other
glycolytic enzymes, the promoters of acid phosphatase,
e.g., Pho5, the promoters of the yeast α-mating
factors, the polyhedron promoter of the baculovirus
system and other sequences known to control the 
expression of genes of prokaryotic or eukaryotic 
cells or their viruses, and various combinations 
thereof. For animal cell expression, we prefer to 
use an expression control sequence derived from the 
major late promoter of adenovirus 2. 

A wide variety of unicellular host cells 
are also useful in expressing the DNA sequences of 
this invention. These hosts may include well known 
eukaryotic and prokaryotic hosts, such as strains of 
E. coli, Pseudomonas, Bacillus, Streptomyces, fungi,
such as yeasts, and animal cells, such as CHO and
mouse cells, African green monkey cells, such as
COS 1, COS 7, BSC 1, BSC 40, and BMT 10, insect cells,
and human cells and plant cells in tissue culture. 
For animal cell expression, we prefer CHO cells and 
COS 7 cells. 

It should of course be understood that not 
all vectors and expression control sequences will 
function equally well to express the DNA sequences 
of this invention. Neither will all hosts function 
equally well with the same expression system. How- 
ever, one of skill in the art may make a selection 
among these vectors, expression control sequences, 
and hosts without undue experimentation and without
departing from the scope of this invention. For
example, in selecting a vector, the host must be
considered because the vector must replicate in it.
The vector's copy number, the ability to control
that copy number, and the expression of any other
proteins encoded by the vector, such as antibiotic
markers, should also be considered.

In selecting an expression control sequence,
a variety of factors should also be considered. 
These include, for example, the relative strength of 
the system, its controllability, and its compatibility 
with the particular DNA sequence of this invention, 
particularly as regards potential secondary struc- 
tures. Unicellular hosts should be selected by 
consideration of their compatibility with the chosen 
vector, the toxicity of the product coded for on 
expression by the DNA sequences of this invention to 
them, their secretion characteristics, their ability 
to fold proteins correctly, their fermentation re- 
quirements, and the ease of purification of the 
products coded on expression by the DNA sequences of
this invention.

Within these parameters, one of skill in
the art may select various vector/expression control
system/host combinations that will express the DNA
sequences of this invention on fermentation or in large
scale animal culture, e.g., CHO cells or COS 7 cells.

The polypeptides produced on expression of
the DNA sequences of this invention may be isolated
from the fermentation or animal cell cultures and
purified using any of a variety of conventional
methods. One of skill in the art may select the
most appropriate isolation and purification techniques
without departing from the scope of this
invention.

The polypeptides produced on expression of
the DNA sequences of this invention are essentially
free of other proteins of human origin. Thus, they
are different than T4 protein purified from human
lymphocytes.

The polypeptides of this invention are
useful in immunotherapeutic compositions and methods.
For example, the polypeptides of this invention are
active in inhibiting infection by agents whose primary
targets are T4+ lymphocytes by interfering with their
interaction with those target lymphocytes. More
preferably, the polypeptides of this invention may
be employed to saturate the T4 receptor sites of
T4-targeted infective agents. Thus, they exert
anti-viral activity by competitive binding with cell
surface T4 receptor sites. This effect is plainly
of great utility in diseases, such as AIDS, ARC and
HIV infection. Accordingly, the polypeptides and
methods of this invention may be used to treat humans
having AIDS, ARC, HIV infection or antibodies to
HIV. In addition, these polypeptides and methods
may be used for treating AIDS-like diseases caused
by retroviruses, such as simian immunodeficiency
viruses, in mammals, including humans.

According to one embodiment of this invention,
antibodies to soluble T4 proteins and polypeptides
may be used in the treatment, prevention, or
diagnosis of AIDS, ARC and HIV infection.

The polypeptides of this invention may
also be used in combination with other therapeutics
used in the treatment of AIDS, ARC and HIV infection.
For example, soluble T4 polypeptides may be used in
combination with anti-retroviral agents that block
reverse transcriptase, such as AZT, HPA-23,
phosphonoformate, suramin, ribavirin and dideoxycytidine.
Additionally, these polypeptides may be used
with anti-viral agents such as interferons, including
alpha interferon, beta interferon and gamma
interferon, or glucosidase inhibitors, such as
castanospermine. Such combination therapies advantageously
utilize lower dosages of these agents,
thus avoiding possible toxicity.

And, the polypeptides of this invention
may be used in plasmapheresis techniques or in blood
bags for selective removal of viral contaminants
from blood. According to this embodiment of the
invention, soluble T4 polypeptides may be coupled to
a solid support, comprising, for example, plastic or
glass beads, or a filter, which is incorporated into
a plasmapheresis unit.

Additionally, the compositions of this
invention may be employed as immunosuppressants useful
in preventing or treating graft-vs-host disease,
autoimmune diseases and allograft rejection.

The compositions of this invention typically
comprise an immunotherapeutic effective amount
of a polypeptide of this invention and a pharmaceutically
acceptable carrier. Therapeutic methods of
this invention comprise the step of treating patients
in a pharmaceutically acceptable manner with those
compositions.

The compositions of this invention for use
in these therapies may be in a variety of forms.
These include, for example, solid, semi-solid and
liquid dosage forms, such as tablets, pills, powders,
liquid solutions or suspensions, liposomes, suppositories,
injectable and infusable solutions. The
preferred form depends on the intended mode of administration
and therapeutic application. The compositions
also preferably include conventional pharmaceutically
acceptable carriers and adjuvants which
are known to those of skill in the art.

Generally, the pharmaceutical compositions
of the present invention may be formulated and administered
using methods and compositions similar to
those used for other pharmaceutically important polypeptides
(e.g., alpha-interferon). Thus, the polypeptides
may be stored in lyophilized form, reconstituted
with sterile water just prior to administration,
and administered by the usual routes of administration
such as parenteral, subcutaneous, intravenous, intramuscular
or intralesional routes. An effective dosage
may be in the range of from 0.5 to 5.0 mg/kg body
weight/day, it being recognized that lower and higher
doses may also be useful.
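
The dosage range stated above scales linearly with body weight, and the arithmetic can be sketched as follows. The function name `daily_dose_range_mg` and the 70 kg body weight are assumptions chosen only for illustration; the sketch is not a dosing recommendation.

```python
from typing import Tuple


def daily_dose_range_mg(body_weight_kg: float,
                        low_mg_per_kg: float = 0.5,
                        high_mg_per_kg: float = 5.0) -> Tuple[float, float]:
    """Return the (low, high) daily dose in mg for a given body weight.

    The default bounds are the 0.5 to 5.0 mg/kg body weight/day range
    given in the text; the body weight is supplied by the caller.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg


# Hypothetical 70 kg patient.
low, high = daily_dose_range_mg(70.0)
print(f"{low:.1f} to {high:.1f} mg/day")  # 35.0 to 350.0 mg/day
```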

This invention also relates to soluble 
receptors and their use in diagnosing or treating 
viral agents which target or bind to those receptors. 
Such soluble receptors may be used as decoys to 
absorb viral agents and to halt the spread of viral 
infection. Alternatively, virus-killing agents may
be attached to the soluble protein receptors, 
providing a direct mode of delivery of those agents 
to the virus. 

More particularly, the polypeptides of 
this invention are useful in diagnostic compositions 
and methods to detect or monitor the course of HIV 
infection. Advantageously, these polypeptides are 
useful in diagnosing variants of the HIV virus, 
regardless of origin of the infecting HIV agent.

For example, soluble T4 proteins and polypeptides
according to this invention, which have a
high affinity for HIV, may be advantageously used to
increase the sensitivity of HIV assay systems now
based upon monoclonal or polyclonal antibodies.
More specifically, soluble T4 proteins and polypeptides
may be used to pretreat test plasma to concentrate
any HIV present, even in small amounts, so
that it is more easily recognized by the antibody.
And soluble T4 proteins and polypeptides may be used
to purify the HIV envelope protein gp120.

Alternatively, the soluble T4 proteins and
polypeptides of this invention may be used to replace
anti-HIV antibodies now used in various assays.
These soluble T4 proteins and polypeptides are
preferable to anti-HIV antibodies for two reasons.
First, soluble T4 exhibits an affinity for HIV of
approximately 10⁻⁹, a level which exceeds the 10⁻⁷
to 10⁻⁸ values of anti-HIV antibodies. And, while
anti-HIV antibodies are more likely to be specific
for different HIV isolates, strain variations would
not affect a soluble T4 protein-based assay, since
all HIV isolates must be capable of interacting with
the T4 receptor as a prerequisite to infectivity.
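
The practical meaning of the affinity comparison can be sketched numerically. The sketch below rests on assumptions that the text does not state: it treats the quoted values as dissociation constants in mol/l and uses a simple one-site binding model in which the fraction bound is L / (L + Kd). The function name `fraction_bound` and the ligand concentration are hypothetical.

```python
def fraction_bound(free_ligand_molar: float, kd_molar: float) -> float:
    """Fraction of binding sites occupied in a one-site model: L / (L + Kd)."""
    if free_ligand_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_ligand_molar / (free_ligand_molar + kd_molar)


# Hypothetical free ligand concentration of 1e-9 mol/l.
ligand = 1e-9
for label, kd in (("Kd 1e-9 (soluble T4 value)", 1e-9),
                  ("Kd 1e-7 (antibody value)", 1e-7)):
    print(f"{label}: {fraction_bound(ligand, kd):.3f}")
# Kd 1e-9 (soluble T4 value): 0.500
# Kd 1e-7 (antibody value): 0.010
```

Under these assumptions, a binder with the lower dissociation constant captures a far larger share of a dilute target, which is consistent with the sensitivity advantage the text claims.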

For example, a soluble T4 protein or poly- 
peptide may be linked to an indicator, such as an 
enzyme, and used in an ELISA assay. Here, soluble
T4 advantageously acts as a measure of both HIV in a
test sample and any free HIV envelope gp120 protein.

And, polyvalent forms of soluble T4 proteins 
or polypeptides may be produced, for example, by 
chemical coupling or genetic fusion techniques, thus
increasing even further the avidity of soluble T4
for HIV.

In order that this invention may be better
understood, the following examples are set forth.
These examples are for purposes of illustration only,
and are not to be construed as limiting the scope of
the invention in any manner.

EXAMPLES 

Purification Of Native Solubilized T4

We purified native T4 from the T4+ promonocytic
cell line U937, derived from a histiocytic
lymphoma, to approximately 50% purity using
immunoaffinity chromatography as follows.

We grew U937 cells [a gift from Dr. Scott
Hammer, New England Deaconess Hospital] to
10⁶ cells/ml in RPMI 1640, 10% FCS, harvested and
washed them in 1X PBS. We then lysed the cell pellet
in 20 mM Tris-HCl (pH 7.7), 0.5% NP-40 (a non-ionic
detergent), 0.2% NaDOC, 0.2 mM EGTA, 0.2 mM PMSF and
5 μg/ml BPTI at 4 x 10⁷ cells/ml. Because this
purification was carried out in the presence of a
non-ionic detergent, T4, which is normally membrane-bound
via its hydrophobic transmembrane domain, was
isolated as a solubilized protein. We spun the lysate
in a GS 3 rotor for 10 min at 10,000 rpm and stored
the supernatant at -70°C.
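
The relationship between culture volume and lysis buffer volume implied by the two cell densities above can be sketched as follows. The function name `lysis_buffer_volume_ml` and the 1000 ml culture volume are hypothetical, and the sketch assumes no cell loss during harvesting and washing.

```python
def lysis_buffer_volume_ml(culture_volume_ml: float,
                           culture_density_per_ml: float = 1e6,
                           lysis_density_per_ml: float = 4e7) -> float:
    """Volume of lysis buffer needed to resuspend the harvested cells.

    Defaults follow the densities in the text: cells grown to 1e6 cells/ml
    and lysed at 4e7 cells/ml. Cell loss on washing is ignored (assumption).
    """
    if culture_volume_ml < 0:
        raise ValueError("culture volume must not be negative")
    total_cells = culture_volume_ml * culture_density_per_ml
    return total_cells / lysis_density_per_ml


# Hypothetical 1000 ml culture: 1e9 cells resuspended at 4e7 cells/ml.
print(lysis_buffer_volume_ml(1000.0))  # 25.0
```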

Subsequently, we preabsorbed the clarified
cell extract with mouse IgG-Sepharose, followed by
protein A Sepharose and then passed the flowthrough
through an immunoaffinity column comprising immobilized
19Thy anti-T4 monoclonal antibody on Affigel-10
[a gift from Dr. Ellis Reinherz, Dana Farber Cancer
Institute, Boston, Massachusetts]. We washed the
column extensively and eluted the bound material
with 50 mM glycine-HCl (pH 2.5), 0.15 M NaCl, 0.5%
NP-40, 5 μg/ml BPTI and 0.2 mM EGTA.

We then separated 10 μl aliquots of each
elution fraction on a 10% SDS-PAGE under reducing 
conditions, with the bands being visualized by silver 
staining. As shown in Figure 1, a major silver- 
stained band of 55 Kd was visible. We then carried 
out two assays on the 55 Kd protein and sequenced 
the amino terminus of the protein to confirm its 
identity as native solubilized T4. 

Sequencing Of Native Solubilized T4 

We determined the N-terminal amino acid 
sequence of our solubilized native T4 which we 
isolated from a detergent extract of U937 cells by 
immunoaffinity chromatography as described above.

Techniques for determining the amino acid 
sequences of various proteins and peptides derived 
from them are well known in the art. We chose automated
Edman degradation to determine the amino
terminus of our solubilized native T4. More specifically,
we gel purified and electroeluted approximately
5 μg of the solubilized native T4 and then
subjected it to automated Edman degradation using a
gas phase sequencer (Applied Biosystems 470A). We
then identified the PTH-amino acids produced at each
cycle of the Edman chemistry by high pressure liquid
chromatography, on-line with the sequencer, in a
PTH-amino acid analyzer (Applied Biosystems 120A).
Direct analysis of the protein provided amino terminal
sequence information which, when compared to the
amino acid sequence deduced from the cDNA sequence
of human T4 [Maddon et al. (1985), supra], identified
the purified protein as human T4.

Radioimmunoassay Of Native Solubilized T4

To determine that our purification process
enriched for T4, we assayed fractions from the
immunoaffinity elution step in a T4-specific sandwich
radioimmunoassay, based upon the ELISA assay of P. E.
Rao et al., in Cellular Immunology, 80, pp. 310-19
(1983). We coated each well of a Removawell strip
(Dynatech Lab., Alexandria, Virginia) with 50 μl of
10 μg/ml OKT4 antibody (ATCC #CRL 8002) or MOPC195
(a background binding control) in 0.05 M sodium
bicarbonate buffer (pH 9.4) at 4°C overnight. We
washed the wells and then filled them with 1% FCS in
PBS to saturate the protein binding capacity of the
plastic. After removing the 1% FCS solution, we
added test samples, in 50 μl aliquots, to the wells.
We then incubated the samples for 4 hours at room
temperature. Subsequently, we removed the samples
and washed the wells four times with 0.05% Tween-20
in PBS. We then added ¹²⁵I-labelled 19Thy antibody
(50,000-100,000 cpm per well) and incubated the wells
at 4°C overnight. We then washed the wells four
times and separated each well for bound ¹²⁵I detection
in a Beckman gamma detector.

As shown in Figure 1, in which values were
plotted following subtraction for background, the 
peak fraction of solubilized native T4 protein 
detected by radioimmunoassay coincided with elution 
of the 55 Kd protein seen by silver staining. 
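
The background subtraction and peak identification described above can be sketched as follows. All counts below are invented for illustration and are not data from this specification; the sketch also assumes that the background for each fraction is the count from the matching MOPC195-coated control well. The function names are hypothetical.

```python
def background_subtracted(okt4_cpm, control_cpm):
    """Subtract control-well counts from OKT4-well counts, fraction by fraction."""
    if len(okt4_cpm) != len(control_cpm):
        raise ValueError("both series must cover the same fractions")
    return [test - control for test, control in zip(okt4_cpm, control_cpm)]


def peak_fraction(values):
    """Return the 1-based number of the fraction with the highest value."""
    return values.index(max(values)) + 1


# Hypothetical cpm values for six elution fractions.
okt4_wells = [520, 610, 4800, 9300, 3100, 700]
mopc195_wells = [500, 480, 510, 495, 505, 490]

specific = background_subtracted(okt4_wells, mopc195_wells)
print(specific)                 # [20, 130, 4290, 8805, 2595, 210]
print(peak_fraction(specific))  # 4
```

The peak fraction found this way is the one that would be compared with the position of the silver-stained 55 Kd band.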

Western Blot Assay For T4 

Although many antibodies have been developed 
for detecting T4 antigen, none are useful for protein 
blot analysis (Dr. Ellis Reinherz, personal communi- 
cation). In order to develop antibodies useful for 
western blot detection of soluble T4 to follow the 
purification of T4 and recombinant soluble T4, we 
raised polyclonal, hyperimmune anti-T4 antisera in
rabbits against three synthetic T4 oligopeptides.
These oligopeptides are represented in Figure 3 as
follows:

Oligopeptide    Amino Acid Coordinates
JB-1            44-63
JB-2            133-156
JB-3            325-343

We had previously synthesized these peptides using
conventional phosphoamide DNA synthesis techniques.
See, e.g., Tetrahedron Letters, 22, pp. 1859-62
(1981). We synthesized the peptides on an Applied
Biosystems 380A DNA Synthesizer and purified them by
gel electrophoresis.

(i) Coupling Of T4 Peptides To BTG

We coupled each of these peptides to the
carrier protein bovine thyroglobulin ("BTG") [Sigma,
St. Louis, Missouri] according to a modification of
procedures set forth in J. Rothbard et al., J. Exp.
Med., 160, pp. 208-21 (1984) and R. C. Kennedy et al.,
"Antiserum To A Synthetic Peptide Recognizes The
HTLV-III Envelope Glycoprotein", Science, 231,
pp. 1556-59 (1986).

More specifically, we mixed 10 mg of BTG
diluted in 1 ml of PBS with 1.3 mg of
m-maleimidobenzoyl-N-hydroxysuccinimide ester ("MBS") in 0.5 ml
of dimethylformamide ("DMF"). We mixed the reaction
mixture well and reacted it for about 1 hour at 25°C.
Subsequently, we loaded the mixture onto a Sephadex
G25 gel filtration column (Pharmacia, Sweden) which
had been pre-equilibrated with 0.1 M PBS (pH 6.0).
We then collected a total of thirty 2 ml aliquot
elution fractions and read the absorbance of each
fraction at 280 nm ("A₂₈₀"). We then pooled the
three peak fractions (15, 16 and 17) to create the
activated carrier.

We dissolved 10 mg of NaBH₄ in 2.5 ml of
0.1 M sodium borate solution to produce a sodium
borohydride solution. Subsequently, we diluted
approximately 8 mg of each of synthetic T4 peptides
JB-1, JB-2 and JB-3 with 1 ml of 0.1 M borate buffer
and then mixed each solution with 200 μl of the sodium
borohydride solution, incubating the mixture on ice
for 5 minutes. We then warmed each peptide solution
to 25°C, brought each solution to pH 1.0 with 1 N
HCl (during which frothing occurred) and then brought
each solution to pH 7.0 with 1 N NaOH (after the
frothing had stopped).

We then coupled each peptide to BTG by 
adding 1.2 ml of the peptide solution to 6 ml of the 
activated carrier solution. We allowed the coupling 
reaction to proceed overnight by incubating the reac- 
tion mixture at room temperature. 

(ii) Inoculation Of Test Animals

We dissolved each of the BTG-coupled peptides
prepared above in sterile Freund's complete
adjuvant, to a final concentration of 1 mg/ml coupled
peptide in PBS. Subsequently, we inoculated each of
three rabbits (New Zealand white) by intramuscular
injection of 500 μg of one of the coupled peptides
into each rabbit. We inoculated a fourth rabbit
(New Zealand white) in the same manner with a mixture
of the three coupled peptides. All rabbits were
prebled prior to boosting to establish an average
baseline for each response to be measured. The
rabbits were boosted at 6 weeks with 500 μg coupled
peptide in incomplete Freund's adjuvant.

Serum was collected from each rabbit monthly 
for 4 months after immunization. The serum was then 
assayed for antipeptide titer. 

(iii) ELISA With Antipeptide Sera
Against Peptide Coated Plates

In this assay, we determined that antiserum
raised in an animal by each of peptides JB-1, JB-2
and JB-3 binds to that peptide. Accordingly, those
peptides are immunogenic and elicit a response in
test animals.

To carry out the assay, we coated Immulon-2
(Dynatech Labs, Alexandria, Virginia) microtiter
plates with 50 μl per well of 50 μg/ml uncoupled
peptide in PBS and incubated the plates overnight at
4°C. Plates coated with peptide 46*, which served
as controls, were treated identically. We then washed
the plates 4 times with PBS-Tween (0.5%) and 4 times
with water. The plates were blotted dry by gentle
tapping over paper towels. After blotting the plates,
we added 200 μl of a 5% FCS/PBS solution to each
well and incubated the plates for 1 hour at room
temperature.

* Peptide 46 corresponds to amino acids ("AA")
728-751 of the env gene of the HIV genome. The amino
acid numbering corresponds to that set forth for the
env gene in L. Ratner et al., "Complete Nucleotide
Sequence Of The AIDS Virus, HTLV-III", Nature, 313,
pp. 277-84 (1985). Peptide 46 has the sequence:
LPIPRGPDRPEGIEEEGGERDRDR.

We then assayed serum samples from the
rabbits on the pre-coated plates prepared as described
above. We assayed the antibody response to the
immunogen peptide at an initial dilution of 1:100,
followed by serial 10-fold dilutions in 5% FCS/PBS.
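
The dilution series described above can be written out explicitly. In the sketch below the initial dilution (1:100) and the 10-fold step come from the text, while the number of steps is an assumption, since the text does not say how far the series was carried; the function name `serial_dilutions` is hypothetical.

```python
from fractions import Fraction


def serial_dilutions(initial=Fraction(1, 100), fold=10, steps=4):
    """Return `steps` dilutions starting at `initial`, each `fold` times weaker.

    `steps` is an illustrative assumption, not a value from the text.
    """
    if steps < 1 or fold < 1:
        raise ValueError("steps and fold must be at least 1")
    return [initial / fold ** i for i in range(steps)]


for dilution in serial_dilutions():
    print(f"1:{dilution.denominator}")
# 1:100
# 1:1000
# 1:10000
# 1:100000
```

Exact fractions are used so that each dilution factor stays an integer ratio rather than a rounded floating point value.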

After a 2 hour incubation period at room
temperature, we washed the plates and blotted them
dry as described above. We then added 50 μl of a
1:1500 dilution of horseradish peroxidase ("HRP")-
conjugated goat anti-rabbit-IgG [Cooper Biomedical,
Malvern, Pennsylvania] in 5% FCS/PBS to each well
and incubated the plates at room temperature for
1 hour. We washed the plates with PBS-Tween 0.5%.
We then added 50 μl of 0.42 mM TMB. We stopped the
enzyme reactions with 50 μl of 2 M H₂SO₄. We then
analyzed the plates spectrophotometrically at 450 nm
using a microtiter plate reader [Dynatech Labs,
Alexandria, Virginia].

We observed that antiserum against each of 
peptides JB-1, JB-2 and JB-3 binds to the corre- 
sponding peptide. We also observed that antiserum 
against a mixture of peptides JB-1, JB-2 and JB-3 
binds to peptides JB-1 and JB-3 under the conditions 
set forth above. The titers of each of the four
antisera tested against the peptides in the solid-phase
ELISA are shown below, where "ND" represents
values not determined:

                        Approximate Titer Against:
Peptide                 JB-1         JB-2         JB-3
JB-1                    >1/50,000    0            ND
JB-2                    0            1/50,000     ND
JB-3                    0            0            1/10,000
JB-1 + JB-2 + JB-3      1/4,000      ND           1/7,000

Ig fractions from two of the three antipeptide
sera raised against individual peptides,
anti-JB-1 and anti-JB-2, recognized the 55 Kd T4
antigen band of native solubilized T4 in a Western
blot analysis of protein eluted from the 19Thy
(anti-T4) monoclonal antibody affinity column
described above. As in the case of the radioimmunoassay
of native solubilized T4, the detection of the
55 Kd protein coincides with its apparent elution
from the affinity column. This provides further
evidence that our T4 purification procedure enriched
for solubilized T4.

Thus, these polyclonal sera are useful in
the detection of nanogram quantities of T4 (both
native and recombinant forms) by Western analysis.

Binding of Cell-Free T4 To HIV Envelope 

We then tested our purified solubilized
native T4 isolated from U937 cells for its ability
to bind to the HIV envelope protein gp160/gp120. To
carry out this direct binding assay, we incubated
³⁵S-labelled gp160/gp120 detergent cell extract
derived from a recombinant cell line 7d2 (a gift
from Drs. Mark Kowalski and William Haseltine,
Dana-Farber Cancer Institute) with samples of solubilized
native T4, each of which had been preincubated with
one type of monoclonal antibody.

More specifically, we mixed 5 ml of solubilized
T4 in a microfuge tube with 5 μg (about 3 μl)
of OKT4 (ATCC #CRL 8002), a monoclonal antibody
recognizing an epitope on T4 which does not interfere
with HIV binding [J. A. Hoxie et al., J. Immunol.,
136, pp. 361-63 (1986)] or with 5 μg of OKT4A (Ortho
Diagnostics #7142), a monoclonal antibody that interferes
with HIV binding to T4 positive cells [J. Steven
McDougal et al., J. Immunol., 137, pp. 2937-2944
(1986)]. Alternatively, we mixed 50 μl of solubilized
T4 with 5 of αHTLV III gp120 (Dupont #NEN-9284).
We then incubated the mixtures on ice for 1 hour.

Subsequently, we added 150 μl of ³⁵S-labelled
gp160/gp120 cell extract or ³⁵S-labelled
control cell extract (precleared with protein-A
Sepharose) to the preincubated solubilized T4/monoclonal
antibody mixtures and rocked the tubes overnight
at 4°C. We then precipitated the T4/gp160/gp120
immune complexes by adding 30 μl of protein-A
Sepharose to each tube and rocking for 2 hours at
4°C to allow the protein-A Sepharose to bind to the
antibody complexes. Subsequently, we spun down the
beads in an Eppendorf microfuge and after extensive
washings, we eluted with 40 μl SDS sample buffer at
65°C for 10 minutes. We then loaded 20 μl of the
eluted material on a 7.5% SDS-PAGE gel which was run
under reducing conditions.

Figure 2 depicts autoradiograph and Western
blot results of the T4/gp160/gp120 coimmunoprecipitations.
In Figure 2, lanes 1-5 were autoradiographed
after treatment with 40% sodium salicylate and lanes
6-7 were developed on a Western blot with rabbit
antisera JB-2.

As shown in Figure 2, gp160/gp120 protein
was coimmunoprecipitated in the presence of T4 with
OKT4 (lane 5) but not in the presence of T4 with
OKT4A (lane 4). Lane 3 shows the positive control
for gp160/gp120 using αHTLV III gp120 monoclonal
antibody. Neither negative control with ³⁵S-labelled
control extract (lane 1) nor protein-A Sepharose alone
(lane 2) showed bands migrating in the position of
gp160/gp120. Based upon the bands that developed
on the Western blot, the amount of T4 precipitated
with either OKT4 (lane 6) or OKT4A (lane 7) appeared
to be similar.

This demonstrates that purified, solubilized
native T4, which is naturally membrane bound, can
still interact with the HIV glycoprotein in solution.
Accordingly, we believe that cell free soluble T4 is
useful in preventing the binding interaction between
HIV and the T4 receptor of T4+ lymphocytes. By competing
with cell surface T4 for binding to the HIV
envelope protein gp120, soluble T4 is useful in blocking
HIV infection.

Synthesis Of Oligonucleotide DNA Probes 

The nucleotide sequence and a deduced amino 
acid sequence for a cDNA that purportedly encodes 
the entire human T4 protein have been reported 
[Maddon et al. (1985), supra]. The deduced primary
structure of the T4 protein reveals that it can be
divided into domains as demonstrated below:

Structure/Proposed Location              Amino Acid Coordinates
Hydrophobic/Secretory Signal             -23 to -1
Homology to V-Regions/Extracellular      +1 to +94
Homology to J-Regions/Extracellular      +95 to +109
Glycosylated Region/Extracellular        +110 to +374
Hydrophobic/Transmembrane Sequence       +375 to +395
Very Hydrophilic/Intracytoplasmic        +396 to +435
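
The domain table lends itself to a simple lookup structure, which also shows where the extracellular region ends relative to the transmembrane sequence. The sketch below is illustrative only: the coordinates are copied from the table, while the names `T4_DOMAINS`, `domain_at` and `last_extracellular_residue` are hypothetical.

```python
# Domain coordinates as listed in the table above. The numbering has no
# residue 0: the signal sequence ends at -1 and the mature protein starts at +1.
T4_DOMAINS = [
    # (structure, proposed location, first residue, last residue)
    ("Hydrophobic", "Secretory Signal", -23, -1),
    ("Homology to V-Regions", "Extracellular", 1, 94),
    ("Homology to J-Regions", "Extracellular", 95, 109),
    ("Glycosylated Region", "Extracellular", 110, 374),
    ("Hydrophobic", "Transmembrane Sequence", 375, 395),
    ("Very Hydrophilic", "Intracytoplasmic", 396, 435),
]


def domain_at(position):
    """Return (structure, location) for a residue position, or None if unlisted."""
    for structure, location, first, last in T4_DOMAINS:
        if first <= position <= last:
            return structure, location
    return None


def last_extracellular_residue():
    """Last residue of the extracellular region according to the table."""
    return max(last for _, location, _, last in T4_DOMAINS
               if location == "Extracellular")


print(last_extracellular_residue())  # 374
print(domain_at(380))                # ('Hydrophobic', 'Transmembrane Sequence')
```

A truncation at or before the residue returned by `last_extracellular_residue` would, under this table, exclude both the transmembrane and intracytoplasmic domains.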

Based on the sequence for the above-listed
domains, we chemically synthesized antisense
oligonucleotide DNA probes using conventional
phosphoamide DNA synthesis techniques. See, e.g.,
Tetrahedron Letters, 22, pp. 1859-62 (1981). We
synthesized the probes on an Applied Biosystems 380A
DNA synthesizer and purified them by gel
electrophoresis.






Furthermore, we synthesized the probes
such that they were complementary to the DNA
sequences which code for the amino acid sequence,
i.e., the probes were antisense, to enable them to
recognize and hybridize to the corresponding sequences
in DNA, as well as in mRNA. The nucleotide sequences
of the eleven selected regions of the T4 protein
[corresponding to the nucleotide numbering set forth
in Maddon et al. (1985), supra] were the following:
Oligonucleotide      Nucleotide Coordinates

 1                   145-171
 2                   742-765
 3                   1414-1440
 6                   427-453
 7                   1303-1329
 8                   1012-1038
 9                   97-118
10                   10-36
11                   1698-1724
12                   397-423
14                   261-287
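
To illustrate what "antisense" means for these probes, the sketch below computes the length implied by each coordinate pair in the table and shows how an antisense sequence is derived from a sense sequence by reverse complementation. The probe sequences themselves are not given in this section, so the short sense string in the usage lines is a made-up example, and the function names are illustrative.

```python
# Coordinates as read from the table above (inclusive, Maddon et al. numbering).
PROBE_COORDS = {
    1: (145, 171), 2: (742, 765), 3: (1414, 1440), 6: (427, 453),
    7: (1303, 1329), 8: (1012, 1038), 9: (97, 118), 10: (10, 36),
    11: (1698, 1724), 12: (397, 423), 14: (261, 287),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def probe_length(start, end):
    """Length in nucleotides of a probe spanning inclusive coordinates."""
    return end - start + 1


def antisense(sense):
    """Reverse complement of a sense-strand DNA sequence (5'->3' in, 5'->3' out)."""
    return sense.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    for number, (start, end) in sorted(PROBE_COORDS.items()):
        print(f"probe {number}: {probe_length(start, end)} nt")
    # Hypothetical sense fragment, for illustration only.
    print(antisense("ATGAACCGG"))
```

A probe built this way pairs with the coding strand of the DNA and with the mRNA, which is the property the text relies on for hybridization screening.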

Before using our DNA probes for screening,
we 5' end-labelled each of the single-stranded DNA
probes with 32P using [γ-32P]-ATP and T4 polynucleotide
kinase, substantially as described by A. M. Maxam
and W. Gilbert, "A New Method For Sequencing DNA",
Proc. Natl. Acad. Sci. USA, 74, pp. 560-64 (1977).

Construction Of λgt10 Peripheral Blood
Lymphocytes cDNA Library

To prepare our Peripheral Blood Lymphocytes
(PBL) cDNA library, we processed PBL, from a single
leukophoresis donor, through one round of absorption
to remove monocytes. We then stimulated the
non-adherent cells with IFN-γ (1000 U/ml) and 10 µg/ml PHA
for 24 hours. We isolated RNA from these cells using
phenol extraction [Maniatis et al., Molecular Cloning,
p. 187 (Cold Spring Harbor Laboratory) (1982)] and
prepared poly A+ mRNA by one round of oligo dT
cellulose chromatography. We ethanol precipitated the
RNA, dried it in a speed vac and resuspended the RNA
in 10 µl H2O (0.5 µg/µl). We treated the RNA for 10
min at room temperature in CH3HgOH (5 mM final
concentration) and β-mercaptoethanol (0.26 M). We then
added the methyl mercury treated RNA to 0.1 M Tris-HCl
(pH 8.3) at 43°C, 0.01 M Mg, 0.01 M DTT, 2 mM Vanadyl
complex, 5 µg oligo dT12-18, 20 mM KCl, 1 mM dCTP,
dGTP, dTTP, 0.5 mM dATP, 2 µCi [α-32P]dATP and 30 U
(1.5 µl) AMV reverse transcriptase (Seikagaku America)
in a total volume of 50 µl. We incubated the mixture
for 3 minutes at room temperature and then for 3 hours
at 44°C, after which time we stopped the reaction by
the addition of 2.5 µl of 0.5 M EDTA.
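
The quantities in the first-strand reaction can be checked with simple arithmetic. This is a minimal sketch, assuming the garbled volume and mass units in the source are microlitres and micrograms; the helper names are illustrative.

```python
def mass_ug(volume_ul, conc_ug_per_ul):
    """Mass of nucleic acid in a solution of given volume and concentration."""
    return volume_ul * conc_ug_per_ul


def final_molar(stock_molar, stock_ul, reaction_ul):
    """Final molarity after adding a stock volume to an existing reaction."""
    return stock_molar * stock_ul / (reaction_ul + stock_ul)


if __name__ == "__main__":
    # 10 ul of RNA at 0.5 ug/ul
    print(f"RNA input: {mass_ug(10, 0.5)} ug")
    # 2.5 ul of 0.5 M EDTA added to the 50 ul reaction
    print(f"EDTA after addition: {final_molar(0.5, 2.5, 50) * 1000:.1f} mM")
```

Under those unit assumptions the EDTA ends up in excess of the 0.01 M magnesium in the reaction, which is consistent with its use to stop the reverse transcriptase.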

We extracted the reaction mixture with an
equal volume of phenol:chloroform (1:1) and
precipitated the aqueous layer two times with 0.2 volume of
10 M NH4Ac and 2.5 volumes EtOH and dried it under
vacuum. The yield of cDNA was 1.5 µg.

We synthesized the second strand according
to the methods of Okayama and Berg [Mol. Cell. Biol.,
2, p. 161 (1982)] and Gubler and Hoffman [Gene, 25,
pp. 263-69 (1983)], except that we used the DNA
polymerase I large fragment in the synthesis.

We blunt ended the double-stranded cDNA by
resuspending the DNA in 80 µl TA buffer (0.033 M Tris
Acetate (pH 7.8); 0.066 M KAcetate; 0.01 M MgAcetate;
0.001 M DTT; 50 µg/ml BSA), 5 µg RNase A, 4 units RNase
H, 50 µM β-NAD, 8 units E.coli ligase, 0.3125 mM
dATP, dCTP, dGTP, and dTTP, and 12 units T4 polymerase.
We incubated the reaction mixture for 90 min at
37°C, added 1/20 volume of 0.5 M EDTA, and extracted
with phenol:chloroform. We chromatographed the
aqueous layer on a G150 Sephadex column in 0.01 M
Tris-HCl (pH 7.5), 0.1 M NaCl, 0.001 M EDTA and
collected the lead peak containing the double-stranded
cDNA and ethanol precipitated it. Yield: 0.605 µg
cDNA.

We ligated the double-stranded cDNA to
linker 35/36:

5' AATTCGAGCTCGAGCGCGGCCGC 3'
3'     GCTCGAGCTCGCGCCGGCG 5'

using standard procedures. We then size selected
the cDNA for 800 bp and longer fragments on a S500
Sephacryl column, and ligated it to EcoRI-digested
bacteriophage lambda vector gt10 (a gift of
Dr. Ellis Reinherz). We packaged aliquots of the
ligation reaction in Gigapak (Stratagene) according
to the manufacturer's protocol. We used the packaged
phage to infect E.coli BNN102 cells and plated the
cells for amplification. The resulting library
contained 1.125 x 10^6 independent recombinants.
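
The two strands of linker 35/36 can be checked for base pairing programmatically. The sketch below, which uses illustrative function names, confirms that the shorter strand pairs with all but the first four bases of the longer strand, leaving a 5' AATT overhang compatible with EcoRI-cut vector arms. It also locates the SacI, XhoI and NotI recognition sequences in the linker; those three sites are standard enzyme recognition sequences and are not named in the text.

```python
TOP = "AATTCGAGCTCGAGCGCGGCCGC"        # written 5'->3'
BOTTOM_3_TO_5 = "GCTCGAGCTCGCGCCGGCG"   # written 3'->5', as in the text

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Standard recognition sequences (not stated in the specification).
SITES = {"SacI": "GAGCTC", "XhoI": "CTCGAG", "NotI": "GCGGCCGC"}


def complement(seq):
    """Base-by-base complement, without reversing."""
    return "".join(PAIR[base] for base in seq)


def overhang_and_duplex(top, bottom_3_to_5):
    """Split the top strand into its 5' overhang and the base-paired part."""
    extra = len(top) - len(bottom_3_to_5)
    overhang, paired = top[:extra], top[extra:]
    if complement(paired) != bottom_3_to_5:
        raise ValueError("strands are not complementary")
    return overhang, paired


def sites_in(seq):
    """Map enzyme name to 0-based start index of its site in seq."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    print(overhang_and_duplex(TOP, BOTTOM_3_TO_5))
    print(sites_in(TOP))
```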

We also screened a PBL cDNA library in the
bacteriophage lambda vector gt10 (a gift of Dr. Ellis
Reinherz), which was synthesized from mRNA from a
T4+ tumor cell line named REX, which expresses T4
protein at high levels [O. Acuto et al., "The Human
T Cell Receptor: Appearance In Ontogeny And
Biochemical Relationship Of Alpha And Beta Subunits
On IL-2 Dependent Clones And T Cell Tumors", Cell,
34, pp. 717-26 (1983)].

Screening Of The Libraries 

We then used three of our 32P-labelled
synthetic oligonucleotide antisense probes, probes 3,
6 and 9, to screen in parallel our two λgt10 cDNA
libraries using the plaque hybridization screening
technique described in R. Cate et al., "Isolation Of
The Bovine And Human Genes For Mullerian Inhibiting
Substance And Expression Of The Human Gene In Animal
Cells", Cell, 45, pp. 685-98 (1986), with minor
modifications. We modified the Cate et al. procedure
by hybridizing without tetramethyl ammonium
chloride to accommodate our use of unique probes,
rather than mixtures, to probe the plaque filters.

We used the three probes, which had been
previously 5' end-labelled with [γ-32P]-ATP according
to the method of A. Maxam and W. Gilbert, Meth.
Enzymol., 68, pp. 499-80 (1979), to screen in parallel
the PBL cDNA library and the REX cDNA library
discussed above.

From our screening of the PBL library, we
isolated a nearly full length soluble T4 cDNA clone,
λ203-4 (or λgt10.PBL.T4), containing a 3.064 kb
insert which could be cleaved from the λgt10 vector
with EcoRI.

From our screening of the REX cell library,
we isolated an incomplete T4 cDNA clone containing
a 1,200 bp cDNA insert. We then further characterized
the DNA from these clones by DNA sequencing analysis.

We also screened a bacteriophage lambda
human genomic library, constructed in the vector
EMBL3 by Dr. Mark Pasek (Biogen Inc., Cambridge,
Massachusetts) [N. Murray in Lambda II, eds. R. Hendrix,
J. Roberts, F. Stahl, R. Weisberg, pp. 395-432 (1983)].
The library contains DNA fragments, created by partial
restriction of chromosomal DNA from the human
lymphoblastoid cell line GM1416, 48,XXXX (Human Genetic
Mutant Cell Repository, Camden, New Jersey) with
Sau3A, ligated onto EMBL3 arms which had been
subjected to cleavage with BamHI according to the
procedures outlined in Maniatis et al. (1982), supra.
Plating of the phage library, lysis, and transfer of
the phage DNA onto nitrocellulose were performed as
described by W. D. Benton and R. W. Davis, "Screening
of Lambda gt Recombinant Clones By Hybridization To
Single Plaques In Situ", Science, 196, p. 180 (1977)
and Maniatis et al. (1982). Hybridization conditions
were those described by Cate et al. (1986), supra,
except that tetramethyl ammonium chloride (TMACl) was
omitted from the washing buffer.

Approximately 2 million plaques were
screened in parallel hybridizations with probe 1
and probe 3 discussed above. One phage, called
CM47, which hybridized with probe 3 in the primary
screenings, was subjected to DNA sequence analysis
to determine the existence and position of an intron
between the coding sequences for the predicted
extracellular and transmembrane domains. No phage clones
containing T4 sequences were found screening with
probe 1, probably because it includes a sequence
interrupted by an intron [D. R. Littman and S. N.
Gettner, Nature, 325, pp. 453-55 (1987); and our
observations].

Partial sequence analysis of CM47 shows
that an intron interrupts the sequence corresponding
to the codon for valine (amino acid 363) of the
deduced primary sequence for T4 (Figure 3, in which
introns are indicated by a solid line). This intron
defines a potential site for introducing a stop codon
in order to express a soluble form of T4. Another
intron found within the coding sequence for T4
interrupts the codon for arginine (amino acid 295) and a
third intron in CM47 is found between the codons for
arginine (amino acid 402) and arginine (amino acid
403) (Figure 3).

Sequencing Of cDNA Clones

We then subcloned EcoRI digested DNA from
clone λ203-4 into animal expression vector pBG312
(Cate et al., supra) to facilitate sequence
analysis. More specifically, as depicted in Figure 4,
we then digested λgt10.PBL.T4 with EcoRI to excise
the 3.064 kbp EcoRI-EcoRI fragment containing the
full length T4 cDNA. This cDNA sequence, including
the entire coding region for soluble T4 and for full
length T4, was deposited in p170-2. We used T4 ligase
to ligate the fragment into animal expression vector
pBG312 [supra] which had been previously cut with
EcoRI, to form pBG312.T4 and p170-2 (Figure 4). We
then determined the nucleotide sequence of the EcoRI
fragment of pBG312.T4 using Maxam-Gilbert technology
[A. M. Maxam and W. Gilbert, "A New Method For
Sequencing DNA", Proc. Natl. Acad. Sci. USA, 74,
pp. 560-64 (1977)] (see Figure 3, which depicts the
PBL cDNA sequence in comparison to that reported by
Maddon et al. (1985), supra). This analysis showed
that the 3.064 kbp PBL full length complementary DNA
copy of T4 cDNA contained the coding sequence for
T4, approximately 200 bp of 5' noncoding sequence
and approximately 1500 bp of 3' noncoding sequence.

We then cut pBG312.T4 with PstI and removed
the resulting 3' protruding ends with Klenow and
isolated an approximately 2.5 kbp fragment. We then
inserted the fragment into the polylinker of pBG312
(which had been previously restricted at the SmaI
site) to form plasmid p170-2, which contains the
full length PBL T4 cDNA sequence (see Figure 3).

As depicted in Figure 3, the PBL T4 cDNA
contains a nucleotide sequence almost identical to
the approximately 1,700 bp sequence reported by
Maddon et al. (1985), supra. The PBL T4 cDNA,
however, contains three nucleotide substitutions that,
in the translation product of this cDNA, would
produce a protein containing three amino acid
substitutions compared to the sequence reported by Maddon
et al. As shown in Figure 3, these differences are
at amino acid position 3, where the asparagine of
Maddon et al. is replaced with lysine; position 64,
where the tryptophan of Maddon et al. is replaced
with arginine; and position 231, where the
phenylalanine of Maddon et al. is replaced with serine.
The asparagine reported at position 3 of Maddon
et al. instead of lysine was the result of a
sequencing error (Dr. Richard Axel, personal communication).
The significance of the amino acid replacements at
positions 64 and 231, which may represent allelic
polymorphism [T. C. Fuller et al., Human Immunology,
9, pp. 89-102 (1982); W. Stohl and H. G. Kunkel,
Scand. J. Immunol., 20, pp. 273-78 (1984); N. Amino
et al., Lancet, 2, pp. 94-95 (1984); and M. Sato
et al., J. Immunol., 132, pp. 1071-73 (1984)], is
not known.
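
Each of the three amino acid differences described above is stated to arise from a single nucleotide substitution. As a plausibility check, the sketch below uses the standard genetic code to list every codon pair that converts one residue into the other with exactly one base change. It does not use the actual T4 codons, which appear only in Figure 3; the function name is illustrative.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def single_base_routes(aa_from, aa_to):
    """List (codon_from, codon_to) pairs differing at exactly one position."""
    routes = []
    for codon1, aa1 in CODON_TABLE.items():
        if aa1 != aa_from:
            continue
        for codon2, aa2 in CODON_TABLE.items():
            differences = sum(x != y for x, y in zip(codon1, codon2))
            if aa2 == aa_to and differences == 1:
                routes.append((codon1, codon2))
    return sorted(routes)


if __name__ == "__main__":
    # One-letter codes: N = asparagine, K = lysine, W = tryptophan,
    # R = arginine, F = phenylalanine, S = serine.
    for position, (old, new) in {3: ("N", "K"), 64: ("W", "R"), 231: ("F", "S")}.items():
        print(position, single_base_routes(old, new))
```

All three replacements are reachable by a single base change, so three nucleotide substitutions are sufficient to account for three amino acid differences.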

DNA sequence analysis [Maxam and Gilbert,
supra] of the insert in pEC100 of the REX clone
suggests that it represents the product of a splicing
error, because 5' noncoding sequence appears to have
been spliced with coding sequence beginning with the
GGT codon for glycine (amino acid 49) (see Figure 3
and Figure 5). The T4 coding sequence in pEC100*
from glycine (amino acid 49) to isoleucine (amino
acid 435) is identical to the sequence of Maddon
et al. (1985), supra.

In comparison, our earlier N-terminal
protein sequence analysis of native T4 protein purified
from U937 cells shows a T4 expression product with
asparagine as amino acid 3. These differences are
also set forth in Figure 6, which also depicts
comparisons at corresponding positions of the partial
clone from the REX cell line λgt10 library; our
genomic clone from a λEMBL3 library; mouse T4
sequences [Tourvieille et al., Science, 234, p. 610
(1986)] and sheep T4 sequences [Classon et al.,
Immunogenetics, 23, p. 129 (1986)].

* We constructed pEC100 by digesting the incomplete
T4 cDNA clone from the REX library with EcoRI and
isolating the 1,200 bp cDNA insert. We then ligated
it to pUC12 (Boehringer Mannheim, Indianapolis,
Indiana) which had been previously cut with EcoRI to
form pEC100.

Construction Of Soluble T4 Mutants

We then employed the technique of in vitro
site-directed mutagenesis and restriction fragment
substitution to modify the T4 cDNA coding sequence
of p170-2 in sequential steps to be identical to
that reported by Maddon et al. (1985), supra. We
first used oligonucleotide-directed mutagenesis to
modify the amino acids at positions 3 and 64. Next,
we employed restriction fragment substitution with a
fragment including the serine 231 codon of a partial
T4 cDNA isolated from a T4 positive lymphocyte cell
line [O. Acuto et al., Cell, 34, pp. 717-26 (1983)]
library in λgt11 (a gift from Dr. Ellis Reinherz),
to modify the amino acid at position 231. We then
truncated our modified T4 cDNA sequence to remove
the coding regions for the transmembrane and
intracytoplasmic domains. Subsequently, we constructed
three different soluble T4 mutants from our full
length T4 clone PBL T4 by linker insertion between
restriction sites in order to increase the probability
of empirically finding a stable, secretable T4
molecule. The structure of each of these mutants is
depicted in Figure 7A.

Line A of Figure 7A represents a hydropathy
analysis of our full length soluble T4 carried out
using a computer program called Pepplot (University
of Wisconsin Genetics Computer Group) according to
J. Kyte and R. F. Doolittle, J. Mol. Biol., 157,
pp. 105-32 (1982). Line B depicts the protein domain
structure of full length T4 [Maddon et al. (1985),
supra] in which "S" represents the secretory signal
sequence, "V" represents the immunoglobulin-like
variable region sequence, "J" represents the
immunoglobulin-like joining region sequence, "U" represents
the unique, extracellular region sequence, "TM"
represents the transmembrane sequence and "C"
represents the cytoplasmic region sequence. In line B,
the transmembrane amino acid sequence and some
flanking sequence is written below the TM domain. Line C
depicts the protein domain structure of recombinant
soluble T4 mutants rsT4.1 in pBG377, rsT4.2 in pBG380
and rsT4.3 in pBG381. Line D represents the protein
domain structure of E.coli rsT4 gene (Met-perfect
construct) (p199-7) which is deleted for the T4
N-terminal signal sequence (S).

We constructed the first three soluble T4
mutant gene fragments by truncating our full length
soluble T4 cDNA at positions corresponding to either
intron/exon boundaries or to protein domain boundaries
defined by hydropathy analysis predictions. More
specifically, we introduced synthetic linkers into
the unique AvaI site that is 5' to the transmembrane/
extracellular domain boundary to produce an in-frame
translational stop codon, thus constructing T4 genes
that lack the transmembrane and cytoplasmic domains
of the full length T4 sequence.

For example, mutant rsT4.1 in pBG377 was
truncated by the insertion of a stop codon following
amino acid 362, lysine, which corresponds to the
position of an intron separating the extracellular
and transmembrane domain exons. The positions both
of this intron and of the adjacent intron that splits
the transmembrane and cytoplasmic domains were
determined by DNA sequence analysis of chromosomal T4
clones isolated from the λEMBL3 genomic library
described above. Although the significance of the
intron positions flanking the T4 transmembrane domain
is not known, the determination of the genetic
structure could provide important information for
designing rsT4 mutants, since exons frequently define
functional domains [W. Gilbert, "Why Genes In Pieces?",
Nature, 271, p. 501 (1978)].

We then constructed mutant rsT4.2 in pBG380
by truncating the T4 cDNA at the boundary of the
transmembrane and extracellular domains at amino
acid 374. And, we constructed mutant rsT4.3 in
pBG381 by truncating the T4 cDNA at amino acid 377,
three amino acids downstream from the transmembrane/
extracellular domain boundary and within the
transmembrane domain.
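
The coordinate arithmetic behind these three truncations can be sketched as follows. This is only an illustration of where an in-frame stop codon falls relative to the start of the coding sequence, assuming a 23-codon signal sequence ahead of mature residue +1; the actual constructs were made by linker insertion at restriction sites, and the TAA stop, the function name and the placeholder sequence are assumptions for the example.

```python
SIGNAL_CODONS = 23   # signal sequence, residues -23 to -1
STOP = "TAA"         # assumed stop codon for illustration

# Last mature residue retained in each mutant, as stated in the text.
TRUNCATIONS = {"rsT4.1": 362, "rsT4.2": 374, "rsT4.3": 377}


def truncate_after(cds, last_residue, stop=STOP):
    """Keep the coding sequence through a mature residue, then add a stop.

    cds must start at the first codon of the signal sequence.
    """
    if last_residue < 1:
        raise ValueError("last_residue must be a mature-protein coordinate")
    keep = (SIGNAL_CODONS + last_residue) * 3
    if keep > len(cds):
        raise ValueError("coding sequence is shorter than requested")
    return cds[:keep] + stop


if __name__ == "__main__":
    # Placeholder sequence standing in for a full-length coding region.
    placeholder_cds = "ATG" + "GCT" * 457
    for name, last in TRUNCATIONS.items():
        truncated = truncate_after(placeholder_cds, last)
        print(name, len(truncated), "nt including stop")
```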

We also employed the technique of
oligonucleotide site directed mutagenesis, according to
D. Strauss et al., "Active Site Of Triosephosphate
Isomerase: In Vitro Mutagenesis And Characterization
Of An Altered Enzyme", Proc. Natl. Acad. Sci. USA,
82, pp. 2272-76 (1985), to construct a fourth soluble
T4 mutant from our full length T4 clone PBL T4. The
structure of this mutant is depicted in Figure 7A,
line D, which represents the protein domain structure
of E.coli rsT4 gene (Met-perfect rsT4.2) construct,
deposited in p199-7, which is deleted for the T4
N-terminal signal sequence (S).

We also constructed various other soluble
T4 deletion mutants to determine which smaller
fragments of the T4 sequence provide a protein which
binds to HIV. These constructions were based on our
belief that only the amino terminal sequence of T4
is required for binding to HIV. This belief, in turn,
was based upon observations that the monoclonal
antibody OKT4A blocks infection of T4 positive cells by
HIV and it appears to recognize an epitope in the
amino portion of T4 [Fuller et al., supra]. Such
fragments of T4, which lack glycosylation and which
are capable of binding HIV and blocking infection,
may be produced in E.coli or chemically synthesized.
The structure of each of these deletion
mutants is depicted in Figure 7B. In that figure,
line A depicts the protein domain structure of full
length T4 [Maddon et al. (1985), supra; Figure 7A].
In line B, the protein structures of recombinant
soluble T4 mutants are depicted as follows: rsT4.7
in p203-5, rsT4.7 in pBG392, rsT4.8 in pBG393, rsT4.9
in pBG394, rsT4.10 in pBG395, rsT4.11 in pBG397,
rsT4.12 in pBG396, rsT4.111 in pBG215-7, rsT4.113.1
in pBG211-11 and rsT4.113.2 in pBG214-10.

We constructed soluble T4 derivatives
p203-5, pBG392, pBG393, pBG394 and pBG396 by
truncating our rsT4.2 gene after the StuI sites at amino
acids 183 and 264 of rsT4.2. More specifically, we
constructed derivative rsT4.7 in p203-5 and in pBG392
by truncating the rsT4.2 cDNA at amino acid 182.
And, we constructed each of derivatives rsT4.9 in
pBG394 and rsT4.12 in pBG396 by truncating the rsT4.2
cDNA at amino acids 113 and 166, respectively. One
may also construct each of derivatives rsT4.10 in
pBG395 and rsT4.11 in pBG397 by truncating the
rsT4.2 cDNA at amino acids 131 and 145, respectively.

Expression of T4 and Soluble T4 
Polypeptides In Bacterial Cells 

The cDNA sequences of this invention can
be used to transform eukaryotic and prokaryotic host
cells by techniques well known in the art to produce
recombinant soluble T4 polypeptides in clinically
and commercially useful amounts.

For example, we constructed expression
vector p199-7, as shown in Figure 9A, as follows.

We preceded the construction depicted in
Figure 9A by the construction of various intermediate
plasmids, as depicted in Figures 8A-8D. Those
constructions were carried out using conventional
recombinant techniques. The linkers employed in
those constructions are set forth in Figure 10.

As depicted in Figures 8A and 8B, starting
with p170-2, which contains our full length T4 DNA
sequence, coding for a T4 that differs at three amino
acids from that of Maddon et al. (1985),
supra, we produced various constructs which direct
the expression of soluble T4. Some of these
constructs are characterized in that one or more of
those amino acid differences have been changed to
correspond to the respective amino acids of Maddon
et al. In this figure, as well as in the other
figures, amino acid changes are reflected by an
arrow.

Plasmid p192-6 contains the Met perfect
rsT4.2 sequence derived by oligonucleotide
site-directed mutagenesis which removed the entire T4
N-terminal signal sequence as shown in Figure 8C.
And, to provide a convenient means of transferring
the rsT4.2 Met perfect sequence into E.coli expression
vectors, the steps described in Figure 8D were carried
out to produce p195-8, a plasmid containing the Met
perfect rsT4.2 sequence flanked by ClaI restriction
sites. The ClaI-ClaI cassette of p195-8 optimizes
the distance between the 5' ClaI site and the
initiating Met codon. In Figure 8D, ST8 rop-
is a tetracycline resistance encoding pAT153-based
plasmid containing the rop- mutation that
permits high plasmid copy number, a promoter and
ribosome binding site from bacteriophage gene 32 and
the gene 32 transcription termination sequence.

Cleavage of p195-8 with ClaI produced the
fragment used to assemble p199-7, a construction
which directs the expression of Met perfect rsT4.2
under the control of the PL promoter (Figure 9A).

As a first step, to construct a vector from which
rsT4.2 expression is under control of the PL promoter,
we constructed the vector p197-12 from p1034
(pLmuGCSF) (Figure 9A).

We then cut p1034 with EcoRI and BamHI to
excise the GCSF cDNA insert and a portion of the
phage mu ribosome binding site sequence, which we
subsequently reconstructed with oligonucleotides.
The synthetic linkers used were linkers 57-60
(Figure 10).

We then ligated the synthetic linker into
the EcoRI/BamHI-cut p1034 to form p197-12. One could,
instead, replace these steps by starting with any
suitable E.coli expression vector containing a ClaI
site appropriately placed between the promoter and
terminator sequences. We cut p197-12 with ClaI and
inserted a ClaI-ClaI cassette containing the cDNA
sequence of rsT4.3 in pBG381 and phage transcription
terminator derived from p1034. The sequence of this
cassette is depicted in Figure 11. The resulting
plasmid, p199-7, contains the rsT4.2 "Met perfect"
gene in that vector.

Alternatively, one could derive the Met
perfect rsT4.2 sequence from plasmid pBG380,
deposited in connection with this application, and
gap out the signal sequence to create p192-6.

We tested for expression of p199-7 as
follows. SG936, an E.coli lon htpR double mutant
[ATCC 39624] [S. Goff and A. Goldberg, "ATP-Dependent
Protein Degradation In E.coli", in Maximizing Gene
Expression, W. Reznikoff and L. Gold (eds.) (1986)],
was transformed with p199-7 by conventional
procedures [Maniatis et al. (1982)] to form SG936/p199-7,
a transformant containing a plasmid with the
Met-perfect rsT4.2 gene behind the PL promoter.
Transformants were selected on LB agar plates containing
10 mcg/ml tetracycline (tet). After streaking out
several single colonies for single colony isolation,
one was chosen at random for testing induction of
rsT4.2 synthesis. We picked a single colony from an
LB-agar tet+ plate into 20 ml Luria Broth (LB) and
10 mcg/ml tet in a 125 ml shake flask and grew it
overnight in a shaking air incubator (New Brunswick
Scientific, New Jersey) at 30°C.

We then initiated an induction culture by
adding 0.5 ml of the overnight culture to 50 ml LB
and tet in a 500 ml flask which was grown at 30°C in
a shaking air incubator. When the culture reached
an OD(600) of 0.4, we transferred it to a 42°C
waterbath and shook it gently for approximately 20 minutes.
After heat induction at 42°C, the flask was
transferred to a 39°C air incubator (New Brunswick
Scientific, New Jersey) where it was shaken vigorously
at 250 rpm. We withdrew samples just after the 42°C
heat shock, and at hourly time points for 4 hours,
and then after overnight growth. The samples were
measured for growth by OD(600) and analyzed following
SDS-PAGE for the pattern of protein synthesis by
Coomassie blue protein staining and by Western blot
analysis with our rabbit antipeptide antibody probes
(described above). Based on the relative molecular
weight and protein blot analysis, the expression of
rsT4.2 was induced from SG936/p199-7 following heat
induction at 42°C (Figure 12).

We transformed p199-7 into a PLmu.tet
expression vector, an E.coli expression vector, at
the unique ClaI site (see Figure 11). The
nucleotide and amino acid sequences of p199-7 are shown in
Figure 11.

The expression of soluble T4 from p199-7
in E.coli was measured by Western blot analysis of
whole cell extracts following SDS-PAGE using the
rabbit polyclonal anti-peptide JB-1 or anti-peptide
JB-2 antibodies as probes (Figure 12).

We also constructed expression vector
p203-5, as shown in Figure 9B, as follows.
We started with p197-7, which has the same
sequence as the PLmu vector p197-12 (see Figure 9A),
except that there is a single nucleotide deletion in
the 5' noncoding region following the PL promoter.
That deletion, which is a deletion of nucleotide
#40 (adenine) of p197-12 (see Figure 11), resulted
from a deletion in the region that was constructed
from linkers 57-60 (see Figure 10). p197-7 contains
the rsT4.2 gene comprising 374 amino acids.
Alternatively, one could also use p197-7 as a starting
plasmid.

We cut p197-7 with ClaI. We also cut p195-8
(see Figures 8D and 9A) with ClaI to remove the
ClaI-ClaI cassette containing the cDNA sequence of
rsT4.2. Subsequently, we inserted the ClaI-ClaI
cassette into p197-7 to produce p198-2.

We then digested p198-2 with StuI to
remove 80 amino acids (amino acid 185 to amino acid
264) of the mature T4 protein coding sequence.
Unexpected methylation, however, prevented cutting at
the second StuI site, so that only the StuI site at
amino acid 184 was cleaved. Following ligation, the
plasmid DNA was transformed into E.coli and we
examined several plasmid clones for the deletion
using standard procedures. None of those plasmids
contained the expected StuI deletion.

Subsequent DNA sequence analysis of one
of these plasmids, called p203-5, showed that two
guanine residues (see amino acids 183 and 184;
nucleotides 818 and 819 of Figure 3) of the StuI
recognition sequence had been deleted following
cleavage due to exonuclease digestion caused by the
use of exonuclease-contaminated StuI enzyme. This
dinucleotide deletion produced a translation
frameshift following amino acid 182 (glutamine) and
introduced a stop codon six amino acid codons downstream
from the frameshift (Figure 9C). The unexpected
methylation of the second StuI site together with
the deletion that resulted in a new stop codon
produced a gene encoding a shortened form of
recombinant soluble T4, called rsT4.7. The rsT4.7 sequence
encodes a 182 amino acid N-terminal segment of the
mature T4 sequence followed by, at the C terminus,
six amino acids (asparagine-leucine-glutamine-
histidine-serine-leucine) of non-T4 sequence and
finally by a TAA stop codon.
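
The effect of a two-base deletion inside a StuI site (AGGCCT) can be illustrated on a toy sequence. The sketch below is not the T4 sequence: `original` is a made-up reading frame chosen so that removing the two guanines shifts the frame onto a nearby stop codon, which mirrors in miniature how the rsT4.7 gene acquired a few non-T4 residues and a premature stop.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate from the first base until a stop codon or the end."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def delete_gg_in_stui_site(seq):
    """Remove the two G residues of the first AGGCCT site in seq."""
    site = seq.find("AGGCCT")
    if site < 0:
        raise ValueError("no StuI site found")
    return seq[:site + 1] + seq[site + 3:]


if __name__ == "__main__":
    original = "ATGCAGAGGCCTAAGTGGTAG"   # hypothetical frame, not T4
    mutant = delete_gg_in_stui_site(original)
    print(translate(original))
    print(translate(mutant))
```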

The expression of soluble T4 from p203-5 
in E.coli was measured by Western blot analysis 
as previously described. 

Expression of T4 and Soluble T4 
Polypeptides In Animal Cells 

We inserted both soluble T4 genes and the
unmodified gene encoding membrane-bound T4 into
animal expression vector pBG368. More specifically,
we inserted each of the soluble gene constructs into
pBG368 under the transcriptional control of the
adenovirus late promoter, to give plasmids pBG377,
pBG380 and pBG381. We also made two pBG312-based
constructions, called pBG378 and pBG379, which
direct the expression of recombinant full length T4
protein. pBG378 and pBG379 code for the same full
length T4 protein but in pBG379, a portion of the 3'
untranslated sequence has been removed. Subsequently,
to test for expression of recombinant soluble T4 and
recombinant full length T4, we cotransfected Chinese
hamster ovary ("CHO") cells with one of each of
those plasmids and with the plasmid pAdD26.

We first constructed pBG368 as follows.
As depicted in Figure 13, we cut animal cell
expression vector pBG312 [R. Cate et al., "Isolation Of
The Bovine And Human Genes For Mullerian Inhibiting
Substance And Expression Of The Human Gene In Animal
Cells", Cell, 45, pp. 685-98 (1986)] with EcoRI and
BglII to delete one of each of the two EcoRI and the
two BglII restriction sites (the EcoRI site at
position 0 and the BglII site located at approximately
position 99). The resulting plasmid, pBG368, retained
an EcoRI site in the cloning region and a BglII site
after the cloning region. This left a single EcoRI
site and a single BglII site in the polylinker for
cloning purposes.

More specifically, we deleted one EcoRI
site and one BglII site by sequential partial
digestion of pBG312 with restriction enzymes EcoRI and
BglII, respectively. We filled in the ends with Klenow
and all four nucleotides, then religated to produce
pBG368, which contains unique restriction sites for
EcoRI and BglII enzymes.
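
Filling in a 5' overhang and religating destroys the original recognition site, which is why this treatment removes one copy of each site. The sketch below shows the resulting junction sequence for EcoRI (G^AATTC) and BglII (A^GATCT); the cut positions are the standard ones for these enzymes, and the function name is illustrative.

```python
def fill_in_and_religate(site, cut_offset):
    """Junction formed by filling in a 5' overhang and blunt-end religating.

    site is a palindromic recognition sequence; cut_offset is the number of
    bases to the left of the top-strand cut (1 for G^AATTC and A^GATCT).
    """
    n = len(site)
    if not 0 < cut_offset < n / 2:
        raise ValueError("expected an enzyme that leaves a 5' overhang")
    left_end = site[:n - cut_offset]    # left fragment after fill-in
    right_end = site[cut_offset:]       # right fragment after fill-in
    return left_end + right_end


if __name__ == "__main__":
    for name, site in {"EcoRI": "GAATTC", "BglII": "AGATCT"}.items():
        junction = fill_in_and_religate(site, 1)
        print(name, junction, "site regenerated:", site in junction)
```

In both cases the junction is four bases longer than the original site and no longer contains it, so the enzyme cannot cut there again.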

Once transient expression of soluble T4
was verified, we constructed stable cell lines that
continuously expressed soluble T4. To do this, we
employed the stable cell expression host, the
dihydrofolate reductase deletion mutant (DHFR-)
Chinese hamster ovary cell line [F. Kao et al.,
"Genetics Of Somatic Mammalian Cells X. Complementation
Analysis of Glycine-Requiring Mutants", Proc. Natl.
Acad. Sci., 64, pp. 1284-91 (1969); L. Chasin and
G. Urlaub, "Isolation Of Chinese Hamster Cell Mutants
Deficient In Dihydrofolate Reductase Activity",
Proc. Natl. Acad. Sci., 77, pp. 4216-20 (1980)].

Using this system, we cotransfected each
T4 gene construct with pAdD26 [R. J. Kaufman and
P. A. Sharp, "Amplification And Expression Of
Sequences Cotransfected with a Modular Dihydrofolate
Reductase Complementary DNA Gene", J. Mol. Biol.,
159, pp. 601-21 (1982)] containing the mouse DHFR
gene. Before carrying out the co-transfections, we
linearized all plasmids by restriction enzyme cleavage
and, prior to transfection, we mixed each plasmid
with pAdD26 so that the molar ratio of pAdD26 to T4
was 1:10. This maximized the number of T4 gene copies
per transfectant.
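
A 1:10 molar ratio translates into a mass ratio that depends on the relative sizes of the two plasmids, because the mass of a linear double-stranded DNA molecule is approximately proportional to its length. The following sketch makes that conversion; the plasmid sizes in the usage lines are hypothetical, since the sizes of pAdD26 and the T4 constructs are not given in this section.

```python
def target_mass_for_ratio(selector_mass, selector_bp, target_bp,
                          target_per_selector=10):
    """Mass of target plasmid giving the requested molar excess.

    Moles are proportional to mass / length for double-stranded DNA, so
    the mass units of the result match those of selector_mass.
    """
    if selector_bp <= 0 or target_bp <= 0:
        raise ValueError("plasmid sizes must be positive")
    return selector_mass * target_per_selector * target_bp / selector_bp


if __name__ == "__main__":
    # Hypothetical sizes: a 4,000 bp selector and a 6,000 bp T4 construct.
    print(target_mass_for_ratio(100, 4000, 6000), "ng of T4 construct per 100 ng selector")
```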

Within the cell, the plasmids were ligated
together to form polymers that can become integrated
into host chromosomal sequences by illegitimate
recombination [J. Haynes and C. Weissmann,
"Constitutive, Long-Term Production Of Human Interferons By
Hamster Cells Containing Multiple Copies Of a Cloned
Interferon Gene", Nucl. Acids Res., 11, pp. 687-706
(1983); S. J. Scahill et al., "Expression And
Characterization Of The Product Of A Human Immune Interferon
cDNA Gene In Chinese Hamster Ovary Cells", Proc. Natl.
Acad. Sci. USA, 80, pp. 4654-58 (1983)]. We selected
transfectants that express the mouse DHFR gene in
culture medium lacking nucleotides. We then subjected
these transfectants to a series of increasing
concentrations of methotrexate, a toxic folate analogue
that binds DHFR, to select for cells expressing
elevated levels of DHFR.

Resistance to methotrexate by increased
expression of DHFR is frequently the result of DHFR
gene amplification, which can include the reiteration
of large chromosomal segments, called amplified
units [R. J. Kaufman and P. A. Sharp, "Amplification
And Expression Of Loss Of Dihydrofolate Reductase
Genes In A Chinese Hamster Ovary Cell Line", Molec.
Cell. Biol., 1, pp. 1069-76 (1981)]. Therefore,
cointegration of DHFR and rsT4 sequences permitted
the amplification of rsT4 genes. Stably transfected
cell lines were isolated by cloning in selective
growth medium, then screened for T4 expression with
a T4 antigen radioimmunoassay (RIA) [D. Klatzmann
et al., Nature, 312, pp. 767-68 (1984)] and by
immunoprecipitation from conditioned medium after
[35S]cysteine ("35S-Cys") metabolic labelling.
We also inserted the soluble T4 derivative
rsT4.7 gene into an animal cell expression plasmid
as follows.

As set forth in Figure 14C, we cut plasmid
pBG381 (Figure 14A) with EcoRI and NheI. We then cut
p186-6 with EcoRI and NheI to remove the 786 base
pair fragment. We ligated that fragment into the
digested pBG381 to form plasmid pBG391. The T4
sequence in pBG391 is identical to both that of
Maddon et al. (1985), supra, at positions 64 (tryptophan)
and 231 (phenylalanine) and to that of pBG381. How-
ever, at position 3, the asparagine reported by
Maddon et al. and present in pBG381 is replaced with
lysine. The nucleotide sequence of pBG391 is depicted
in Figure 15.

We then digested p203-5 with NheI and
OxaNI to remove the 483 base pair fragment. We
inserted that fragment into NheI/OxaNI-digested
pBG391 to form plasmid pBG392, the animal cell
expression construct of rsT4.7. The T4 sequence in
rsT4.7 contains amino acids identical to those of
Maddon et al.'s full length sequence at amino acid
positions 64 (tryptophan) and 231 (phenylalanine).
However, at position 3, the asparagine reported by
Maddon et al. is replaced with lysine. The nucleo-
tide sequence of pBG392 is depicted in Figure 16.

In Figure 14D, we have depicted the con-
struction of other animal cell expression constructs
containing sequences encoding the deletions rsT4.9
in pBG394, and rsT4.12 in pBG396. Those constructions
were carried out using conventional recombinant tech-
niques. The linkers employed in those constructions
are set forth in Figure 18. The nucleotide sequences
of pBG394 and pBG396 are shown in Figures 19 and 20.

Plasmid pBG393, shown in Figure 17, con-
tains rsT4.8, the perfect form of rsT4.7. pBG393
contains 182 amino acids of the mature T4 sequence,
without the additional 6 non-T4 amino acids at the
C-terminus following amino acid 182. The nucleotide
sequence of pBG393 is shown in Figure 21.

Other animal cell expression plasmids
according to this invention may be constructed as
depicted in Figure 17. These include rsT4.10 in
pBG395 and rsT4.11 in pBG397 (see Figure 18 for
specific linkers).

The nucleotide sequence of pBG395 is shown
in Figure 22.

Purification Of Recombinant Soluble T4

DHFR⁻ CHO cells expressing recombinant soluble T4
construct pBG380 were grown to confluency
in α-Modified Eagle's Medium (Gibco) supplemented
with 10% fetal calf serum, 1 mM glutamine and the
antibiotics penicillin and streptomycin (100 µg/ml
of each). The cells were grown at 37°C in two 21 Cell
Factory Systems (Nunc). We then washed the confluent
cells free of fetal calf serum with α-Modified Eagle's
Medium without fetal calf serum and cultured the
cells in α-Modified Eagle's Medium at 37°C for 4 days.
Subsequently, we harvested the conditioned media,
filtered it through a Millipore Minidisk 0.22 µ
hydrophilic filter cartridge (Millipore #MCGL 305-01)
and concentrated the secreted proteins on a fast-S
ion exchange column (S-Sepharose Fast Flow, Pharmacia
#17-0511-01) in 20 mM MES buffer (pH 5.5).

We then eluted the bound proteins with 20 mM
Tris-HCl (pH 7.7) and 0.3 M NaCl. The elution pool
was subsequently diluted with 2 volumes of 20 mM
Tris-HCl (pH 7.7) and it was then loaded on a column
comprising immobilized 19Thy anti-T4 monoclonal anti-
body coupled to Affigel-10 [a gift of Dr. Ellis
Reinherz, Dana Farber Cancer Institute, Boston,
Massachusetts]. We washed the column extensively
and eluted the bound material as 0.5 ml fractions
with 50 mM glycine-HCl (pH 2.5), 150 mM NaCl, 0.1 mM
EGTA and 5 µg/ml bovine pancreatic trypsin inhibitor,
Aprotinin (Sigma #A1153). We used Western blots
developed with rabbit antisera raised against peptide
JB-2 to follow the purification. We employed silver
stained gels to follow binding and elution of rsT4.2
during the chromatography. Figure 23 depicts a
Coomassie stained gel of purified rsT4.2.

Gel sizing-column chromatography analysis
of the purified rsT4.2 from the pBG380-transfected
CHO cell line, BG380G, suggests that rsT4 is monomeric
under physiologic pH and salt concentration.

Sequencing Of Recombinant
Soluble T4 Protein

We then determined the N-terminal amino
acid sequence of a recombinant soluble T4, specifi-
cally rsT4.2, molecule purified from the conditioned
medium of the pBG380-transfected CHO cell line BG380G,
as described above, by automated Edman degradation
in an Applied Biosystems 470A gas phase sequenator
[R. B. Pepinsky et al., J. Biol. Chem., 261,
pp. 4239-46 (1986)].

The amino terminal sequence matched the
sequence which we had previously determined for
solubilized native T4 isolated from U937 cells, supra.
The amino terminal sequences of native solubilized
T4 (sT4) and purified rsT4 protein are Δ2 proteins,
as compared to the amino terminal sequence predicted
by Maddon et al. (1985), supra, with the mature amino
terminus located at position 3 of that sequence. The
amino terminal sequences of solubilized native T4
(sT4), recombinant soluble T4 (rsT4.2) secreted by
CHO transfectant BG380G containing pBG380 and the
protein sequence deduced by Maddon et al. (1985),
supra, are as follows:



sT4:                X-K-V-V-L-X-K-K-X-D-T-V-E-L-T-X-T-A-S-E-
rsT4.2:             N-K-V-V-L-G-K-K-G-D-T-V-E-L-T-X-T-A-S-E-
Maddon et al.:  Q-G-N-K-V-V-L-G-K-K-G-D-T-V-E-L-T-C-T-A-S-E



In the above sequences, the amino acids
are represented by single letter codes as follows:

Phe: F    Leu: L    Ile: I    Met: M
Val: V    Ser: S    Pro: P    Thr: T
Ala: A    Tyr: Y    His: H    Gln: Q
Asn: N    Lys: K    Asp: D    Glu: E
Cys: C    Trp: W    Arg: R    Gly: G

X: not determined or ambiguous.
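The comparison of the three amino terminal sequences can be checked mechanically. The sketch below is illustrative only: the function name and the offset convention are our own, the sequences are transcribed from the listing above, and residues marked X are treated as undetermined and skipped rather than counted as mismatches.

```python
# Three-letter to one-letter amino acid codes, as tabulated above.
THREE_TO_ONE = {
    "Phe": "F", "Leu": "L", "Ile": "I", "Met": "M", "Val": "V",
    "Ser": "S", "Pro": "P", "Thr": "T", "Ala": "A", "Tyr": "Y",
    "His": "H", "Gln": "Q", "Asn": "N", "Lys": "K", "Asp": "D",
    "Glu": "E", "Cys": "C", "Trp": "W", "Arg": "R", "Gly": "G",
}

ST4 = "X-K-V-V-L-X-K-K-X-D-T-V-E-L-T-X-T-A-S-E-"
RST4_2 = "N-K-V-V-L-G-K-K-G-D-T-V-E-L-T-X-T-A-S-E-"
MADDON = "Q-G-N-K-V-V-L-G-K-K-G-D-T-V-E-L-T-C-T-A-S-E"


def mismatches(observed, reference, offset=0):
    """Return (reference position, observed, expected) for each disagreement.

    `offset` is the number of reference residues that precede the first
    observed residue. 'X' in the observed sequence is skipped.
    """
    obs = observed.strip("-").split("-")
    ref = reference.strip("-").split("-")[offset:]
    return [
        (i + offset + 1, o, r)
        for i, (o, r) in enumerate(zip(obs, ref))
        if o != "X" and o != r
    ]


# Both determined sequences start at position 3 of the Maddon et al. sequence.
print(mismatches(ST4, MADDON, offset=2))
print(mismatches(RST4_2, MADDON, offset=2))
```

An empty list for each comparison means that every determined residue agrees with the deduced sequence when the mature amino terminus is placed at position 3.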

We also constructed p211-11, a plasmid
coding for the N-terminal 113 amino acids of soluble
T4 protein. This construct, which codes for a pro-
tein characterized by a single disulfide bridge,
between the cysteines at amino acid positions 18 and
86, is conveniently expressed in E.coli.

To construct p211-11, as depicted in Fig-
ure 24, we first cut p195-8 (see Figures 8D and 9A)
with ClaI to remove the ClaI-ClaI cassette contain-
ing the cDNA sequence of rsT4.2. We then digested
pAT153Y3SH16AAmp, the tryptophan operon promoter
plasmid from the gamma interferon producing E.coli
strain BN374, with ClaI, and deleted the cDNA coding
for gamma interferon. Subsequently, we inserted
the ClaI-ClaI cassette into the ClaI-cut E.coli
plasmid in front of the tryptophan operon promoter
and ligated to produce p196-10.

As shown in Figure 25, we then subjected
pBG380 to oligonucleotide-directed mutagenesis to
insert three tandem translational stop codons follow-
ing the T4 cDNA sequence coding for amino acids -23
to 113 in pBG380, to produce pBG394.

We then constructed p211-11 from fragments
of each of p196-10, pBG394 and p1034 as depicted in
Figure 26. The first fragment, including the vector
sequences, was produced by restricting p196-10 with
HindIII and ClaI to remove the T4 coding sequence
from amino acids 61 through 374 of rsT4.2 and includ-
ing vector sequence following the 3' end of the rsT4
gene. The second fragment, a HindIII-BglII segment
including the codons for T4 amino acids 61-113 of
rsT4.S immediately followed by a triplet of stop 
codons in tandem, was isolated by HindIII/BglII diges-
tion of pBG394. The third fragment, a BamHI-ClaI
fragment containing a bacteriophage T4 transcriptional
termination signal [H. N. Kirsch and B. Allet,
"Nucleotide Sequences Involved In Bacteriophage T4
Gene 32 Translational Self-Regulation", Proc. Natl.
Acad. Sci. USA, 79, pp. 4937-41 (1982)], was isolated
by BamHI/ClaI digestion of p1034. We then ligated
these three fragments to produce p211-11, a T4 con-
struct coding for a 113 amino acid soluble form of
T4 protein, with asparagine at amino acid position 3
(i.e., rsT4.113.1).

We then subjected p211-11 to oligonucleo-
tide site-directed mutagenesis (Figure 27) to change
the amino acid at position 3 from asparagine to
lysine using the oligonucleotide T4-66:



25 



71 AAA 



ATG CAG GGT 



rA GTA 



30 



AAA GTA GTA CTG 
GGC 3' . 

This produced plasmid p214-10, a fully
corrected 113 amino acid soluble T4 vector coding
for a 113 amino acid soluble form of T4 protein,
with lysine at amino acid position 3 (i.e.,
rsT4.113.2). As shown in Figure 27, we subjected
p214-10 to oligonucleotide site-directed mutagene-
sis to delete glutamine and glycine at, respectively,
amino acid positions 1 and 2 of the T4 sequence using
the oligonucleotide T4AID-87:

5' GTA TCG ATT TGG ATG ATG AAA AAA GTA GTA 3'.

This produced p215-7, a 111 amino acid
soluble T4 construct, including the trp promoter,
which directs the expression of a 111 amino acid
soluble form of T4 protein, with lysine at amino
acid position 3 (i.e., rsT4.111).

We next constructed p218-8, a 111 amino
acid construct which directs the expression of a 111
amino acid soluble form of T4 protein, with lysine
at amino acid position 3 (i.e., rsT4.111) under the
control of the PL promoter, as depicted in Figure 28.

More specifically, we cut p197-12 (Figure
9A) with ClaI to remove the 101 bp fragment contain-
ing linker and terminator sequences. We also cut
p215-7 with ClaI to remove the ClaI-ClaI cassette
containing the cDNA sequence of rsT4.111 and the φT4
transcriptional terminator sequence [Kirsch and Allet,
supra]. Subsequently, we inserted the ClaI-ClaI
cassette into the ClaI-cut p197-12 to produce p218-8.

In order to express rsT4.113.1, we trans-
formed E.coli A89 with p211-11 by conventional
techniques [Maniatis et al. (1982), supra] to form
E.coli A89/p211-11. E.coli A89 is a tetracycline
sensitive derivative of E.coli SG936. We isolated
E.coli A89 from E.coli SG936 according to the method
of S. R. Maloy and W. D. Nunn, "Selection For Loss
Of Tetracycline Resistance By Escherichia coli",
J. Bact., 145, pp. 1110-12 (1981), which is based
upon the ability of the lipophilic chelating agent
fusaric acid to selectively inhibit resistant strains.
More specifically, we plated E.coli SG936 on medium
containing, per liter, 5 g tryptone, 5 g yeast extract,
10 g NaCl, 10 g NaH2PO4·H2O, 50 mg chlortetracycline-
HCl, 12 mg fusaric acid, 0.1 mM ZnCl2 and 15 g agar.
Colonies which grew at 30°C (putative tetracycline-
sensitive strains) were retested for tetracycline
sensitivity on L-agar plates containing 5 µg/ml
tetracycline. One tetracycline-sensitive strain,
designated A89, was then shown to be unable to grow
on LB agar at 42°C, thus verifying the presence of
the htpR mutation.

Transformants were selected by tetracycline
resistance. We picked a single colony into 20 ml of
minimal medium plus 0.2% casamino acids plus trypto-
phan (100 µg/ml) plus tetracycline (10 µg/ml) in a
100 ml shake flask placed in a shaking air incubator
at 30°C and allowed the cells to grow up overnight.
The following morning, we inoculated 40 ml of mini-
mal medium plus 0.2% casamino acids plus tryptophan
(100 µg/ml) plus tetracycline (10 µg/ml) with the
overnight culture at OD600 = 0.05 in a 500 ml flask.
The cells were grown to midlog phase and then induced
by pelleting, washing once in minimal medium and
then resuspending in minimal medium plus 0.2% cas-
amino acids plus tetracycline (10 µg/ml), in the
absence of tryptophan. We removed 0.6 OD600 of
cells after 0, 1, 2, 3 and 4 hours incubation and
after growth overnight.
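The inoculation and sampling volumes implied by the protocol above follow from simple dilution arithmetic. The sketch below is an illustration only: the overnight culture density and the culture density at sampling are hypothetical values, and the calculation treats 40 ml as the final volume, which slightly overstates the inoculum for small additions.

```python
def inoculum_volume_ml(target_od, final_volume_ml, overnight_od):
    """Volume of overnight culture needed to reach target_od in
    final_volume_ml, using C1 * V1 = C2 * V2."""
    return target_od * final_volume_ml / overnight_od


def sample_volume_ml(od_units, culture_od):
    """Volume of culture that contains the requested number of
    OD600 units (OD600 x ml)."""
    return od_units / culture_od


# Hypothetical densities, for illustration only.
OVERNIGHT_OD600 = 2.0
MIDLOG_OD600 = 0.5

print(inoculum_volume_ml(0.05, 40.0, OVERNIGHT_OD600))  # ml of overnight culture
print(sample_volume_ml(0.6, MIDLOG_OD600))              # ml removed per time point
```

Sampling a fixed number of OD600 units, rather than a fixed volume, keeps the amount of cell material per gel lane roughly constant as the culture grows.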

The aliquots were centrifuged and cell
pellets were subjected to lysis by boiling in
Laemmli gel loading buffer. After centrifugation to
remove cell debris, half of each sample was subjected
to SDS-PAGE, followed by Western blot analysis with
our rabbit antipeptide antibody probes or by Coomassie
blue protein staining (Figures 29A and 29B).

Purification Of rsT4.113.1

We then purified rsT4.113.1 from the E.coli
transformant by means of two essentially quantitative
steps involving anion-exchange and gel-filtration
chromatographies performed under reducing and dena-
turing conditions.

More specifically, we suspended 14 g of
wet cells from a 4 L shake-flask fermentation in
100 ml of a 20 mM Tris (pH 7.5) buffer containing
20 µg/ml DNase, 20 µg/ml RNase and 1 mM phenylmethyl-
sulfonyl fluoride ("PMSF"). The suspension was applied
to a French Press at 1000 psi in two passages and
then centrifuged in an SA 600 rotor at 18,000 g for
15 min at 4°C. The resulting pellet was solubilized
in 20 ml of a 20 mM Tris (pH 7.5) buffer containing
7 M urea and 10 mM 2-mercaptoethanol. We then sub-
jected the suspension to ultracentrifugation at
85,000 g for 90 min at 4°C. The supernatant was
diluted by the addition of 80 ml of 20 mM Tris
(pH 7.5) buffer containing 7 M urea and 10 mM
2-mercaptoethanol and 40 ml of the sample was
applied to a 3 x 4 cm Q-Sepharose fast-flow column
(Sigma, St. Louis, Missouri) which had been pre-
equilibrated in the same buffer. The column was
developed with a gradient in 400 ml total volume of
increasing NaCl from 0 to 0.3 M in the same Tris/urea/
2-mercaptoethanol buffer. Column fractions were
monitored for absorbance at 280 nm and for protein
content by SDS-PAGE (15% acrylamide). The fractions
were also analyzed by Western blots. Figure 30,
panel (a) is a chromatogram displaying the purifi-
cation of rsT4.113.1 by ion-exchange chromatography.
In that figure, peaks containing rsT4.113.1 are
identified. The rsT4.113.1 was found to elute early
in the NaCl gradient and to be well-resolved from
low-molecular weight contaminants.

In order to separate rsT4.113.1 from high-
molecular weight contaminants, we carried out gel-
filtration chromatography on an rsT4.113.1-containing
pool for final purification of the protein to near
homogeneity (>95% purity). More specifically, we
prepared a pool containing 20 mg of protein in 50 ml
and then concentrated to 10 ml in a stirred-cell
ultrafiltration unit (Amicon, Danvers, MA) using a
PM-30 membrane (Amicon). Subsequently, 5.0 ml of
the concentrate was applied to a 1.5 x 95 cm S-300
column (Sigma) equilibrated and developed in the
same Tris/urea/2-mercaptoethanol buffer. We moni-
tored the column fractions for absorbance at 280 nm
and for protein content by SDS-PAGE. The fractions
were also analyzed by Western blots. A pool con-
taining rsT4.113.1 (approximately 4 mg) in 15 ml was
thus prepared. Figure 30, panel (b) is a chromato-
gram displaying the purification of rsT4.113.1 by
gel-filtration separation of the rsT4.113.1 pool.
In that figure, peaks containing rsT4.113.1 are
identified.

Figure 30, panel (c) is an SDS-PAGE analysis
depicting the purification of the rsT4 derivative
throughout the centrifugation and chromatography
steps. In Figure 30, panel (c), the lanes depicted
are:

lane A: molecular weight standards
lane B: cell extracts
lane C: cell pellet following solubilization
        of cell extract in non-denaturing
        conditions
lane D: supernatant following solubilization
        of cell extract in non-denaturing
        buffer
lane E: supernatant following ultracentri-
        fugation
lane F: Q-Sepharose pool
lane G: S-300 gel-filtration pool.

Refolding Of Purified rsT4.113.1

We refolded the purified rsT4.113.1 by
dilution and dialysis steps to non-denaturing and
oxidized conditions. More specifically, refolding
of the protein at a concentration of 0.5 OD (280)/ml
was achieved by stepwise dialysis against 500 volumes
of 3 M urea, 20 mM Tris (pH 7.5); 500 volumes of 1 M
urea, 0.1 M ammonium acetate (pH 6.8) and, finally,
the same volume of a phosphate-buffered saline solu-
tion. Throughout the refolding procedure, samples
of the protein were monitored for relative content
by spectral analysis and by high-performance liquid
chromatography ("HPLC") performed on a 130A liquid
chromatographic system (Applied Biosystems, Inc.,
Foster City, California). An octasilyl column
(Aquapore RP-300, 0.46 x 3.0 cm) was equilibrated in
80% 0.1% trifluoroacetic acid ("TFA")/water (sol-
vent A) and 20% 0.085% TFA/70% acetonitrile (sol-
vent B) and developed with a linear gradient of
increasing acetonitrile concentration from 20% to
80% (solvent B) over 45 min at a flow rate of
0.5 ml/min.
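The gradient just described is linear in solvent B, so its composition at any time can be computed directly. The following is a minimal sketch under the stated conditions (20% to 80% solvent B over 45 min at 0.5 ml/min); the function names are our own, and the sketch does not model any instrument dwell volume or re-equilibration step.

```python
def percent_b(t_min, start_b=20.0, end_b=80.0, duration_min=45.0):
    """Percentage of solvent B at time t_min for a linear gradient.

    Times before the start or after the end of the gradient are
    clamped to the starting or final composition.
    """
    t = min(max(t_min, 0.0), duration_min)
    return start_b + (end_b - start_b) * t / duration_min


def volume_delivered_ml(t_min, flow_ml_per_min=0.5):
    """Total solvent volume delivered after t_min at a constant flow rate."""
    return flow_ml_per_min * t_min


for t in (0.0, 15.0, 30.0, 45.0):
    print(t, percent_b(t), volume_delivered_ml(t))
```

Under these conditions the composition changes by 60/45, or about 1.33, percentage points of solvent B per minute.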






As shown in Figure 31, panel (a), protein
in 7 M urea, 10 mM 2-mercaptoethanol and 20 mM
Tris (pH 7.5) eluted from the HPLC column at 49%
acetonitrile in the gradient. In subsequent steps,
from 1 M urea/1 mM ammonium acetate (pH 6.8) [Fig-
ure 31, panel (b)] to phosphate buffered saline
[Figure 31, panel (c)], an increasing percentage of
rsT4.113.1 was found to elute earlier in the HPLC
gradient, at 47% acetonitrile. The identity of
the earlier eluting peak as oxidized product was
verified by reduction of rsT4.113.1 in non-chaotropic
solutions and application of sample thus treated to
HPLC under the same conditions.

The elution of oxidized rsT4.113.1 prior to
reduced protein on HPLC suggests that formation of
the single disulfide bridge decreases relative hydro-
phobicity of the protein [J. L. Browning et al., Anal.
Biochem., 155, pp. 123-28 (1986)]. Spectral analysis
of rsT4.113.1 was performed throughout the course of
refolding in order to monitor relative yield of solu-
ble protein in the procedure. The refolding method
allowed approximately 20% recovery of rsT4.113.1.
HPLC analysis indicated a less than 15% contaminant
of reduced protein in the preparation (Figure 30,
panel (c), lane G).

Sequencing Of Renatured rsT4.113

We then carried out amino acid sequence analysis of
rsT4.113.1 by automated Edman degradation in an
Applied Biosystems 470A gas phase sequenator equipped
with a 900A data system. Phenylthiohydantoin amino
acids generated during the course of the degradative
chemistry were analyzed on-line using an Applied
Biosystems 120A PTH-analyzer equipped with a PTH-C18
2.1 x 220 mm column. Protein (10 µg) for sequence
analysis was applied to SDS-PAGE (15% acrylamide)
and electroblotted on an Immobilon membrane (Millipore
Corp., Bedford, Massachusetts) as described by
P. Matsudaira, J. Biol. Chem., 262, pp. 10035-38
(1987).

Amino acid analysis of protein samples was
performed by hydrolysis of protein in 6 N HCl, in
vacuo, for 24 h at 110°C. The hydrolysates were
then applied to a Beckman 6300 Analyzer equipped
with post-column detection by ninhydrin. Western
blot analysis of the SDS-PAGE gels was carried out
by standard techniques using rabbit antisera JB-1.
Sequence analysis revealed an amino terminal sequence
of: Met-Gln-Gly-Asn-Lys-Val-Val ...

The purified rsT4.113.1 protein was found
to contain stoichiometric quantities of the amino-
terminal methionine placed in the protein construct
for expression in E.coli and an intact polypeptide
chain consistent with a sequence derived from the
plasmid construction. Recovery of phenylthiohydan-
toinyl-methionine at the first cycle of the degrada-
tive chemistry was 60%, consistent with routine initial
yields obtained in the automated Edman. This obser-
vation excludes the possibility that a significant
percentage of the rsT4.113.1 lacked the initiation
methionine, i.e., the NH2-methionine was not removed
by expression of rsT4.113.1 in E.coli, or that sequence
analysis was impaired by the presence of glutamine
at the first cycle of the degradative chemistry.
Sequence analysis was performed for 40 cycles and no
evidence of lysine carbamylation was observed. Amino
acid analysis displayed a close correlation of actual
and theoretical values for amino acids, thus indicating
the marked absence of proteolytic degradation in the
course of expression, or purification, or both.

Immunoprecipitation Of CHO Cell
Lines Producing Soluble T4

We tested the conditioned media from 35S-Cys
metabolically labelled CHO cells transfected with
one of the T4 mutant constructs pBG377, pBG380, pBG381,
the full length recombinant T4 construct pBG379, of
this invention or vector only, to determine whether
any produced a molecule recognized by the anti-T4
monoclonal antibody 19Thy. To carry out this test,
we incubated about 10^7 CHO cells transfected with
either pBG380, pBG381, pBG377, pBG379 or pBG312, for
5 hours at 37°C with 180 µCi/ml 35S-labelled cysteine
[DuPont, New England Nuclear] in 4 ml RPMI cys⁻ medium
(Gibco). After labelling of the cells, 1 ml of fil-
tered, conditioned media was made 0.5 mM with phenyl-
methyl-sulphonyl fluoride and immunoprecipitated
with OKT4 and protein A Sepharose [P. E. Sayre and
E. L. Reinherz, Eur. J. Immunol., 15, pp. 291-95
(1985)]. Subsequently, we incubated media from the
35S-labelled cells with OKT4 (ATCC #CRL 8002). We
then immuno-precipitated with protein A Sepharose
and subjected the immuno-precipitates to SDS-PAGE
under reducing conditions on 10% polyacrylamide gels
[U. K. Laemmli, Nature, 227, pp. 680-85 (1970)].
Autoradiography was carried out with X-Omat X-ray
film (Eastman Kodak).

As shown in lanes 3-5 of Figure 32, both
pBG380 (rsT4.2) and pBG381 (rsT4.3) directed the
synthesis of a secreted, immune, 35S-labelled T4
protein that was recognized by the OKT4 anti-T4
antibody. The immunoprecipitated truncated mole-
cules migrated as 49 Kd proteins, a result consis-
tent with their predicted molecular weights. In
contrast, no soluble T4 antigen could be detected in
the conditioned media of cell lines stably trans-
fected with pBG377 (rsT4.1) or pBG379 (rflT4).
Immunoprecipitation analysis of cellular extracts of
cell lines transfected with pBG377 suggests that the
rsT4.1 gene product may be misfolded, which could account
for a block in its secretion [M. J. Gething et al.,
Cell, 46, pp. 939-50 (1986)].

In Figure 32, the lanes represent the
following: Lane 1: immunoprecipitation from condi-
tioned medium of CHO cells stably co-transfected
with vectors pBG312 and pAdD26. Lane 2: blank.
Lanes 3 and 4: immunoprecipitation from conditioned
medium of CHO cells stably co-transfected with pBG380
(rsT4.2) and pAdD26. Lanes 5 and 6: immunoprecipita-
tion from conditioned medium of CHO cells stably
co-transfected with pBG381 (rsT4.3) and pAdD26.
Lane 7: immunoprecipitation from conditioned medium
of CHO cells stably co-transfected with recombinant
full length T4 (pBG379) and pAdD26. In Figure 32,
the arrow indicates the predicted position of the
soluble T4 from pBG380 or pBG381 relative to the
migration of standard molecular weight markers.

Immunoprecipitation Of COS 7 Cell Lines
Producing Recombinant Soluble T4

We expressed recombinant soluble T4
derivatives pBG392, pBG393 and pBG394 in COS 7 cells
by electroporation, essentially as described by
G. Chu et al., "Electroporation For The Efficient
Transfection Of Mammalian Cells With DNA", Nuc.
Acids Res., 15, pp. 1311-26 (1987). More specifi-
cally, we introduced 20 µg closed circular plasmid
DNA and 380 µg of carrier (sonicated salmon sperm
DNA) into 3 x 10^7 COS 7 cells. The cells were
electroporated using a Gene Pulser (Biorad) set at
300 volts. Subsequently, we incubated the COS 7
cells in Dulbecco's Modified Eagle's Medium supple-
mented with 10% fetal calf serum for 24 hours. We
then harvested the conditioned media, filtered it
through a Millipore Minidisk 0.22 µ hydrophilic
filter cartridge (Millipore #MCGL 305-01) and
concentrated the secreted proteins on a fast-S ion
exchange column (S-Sepharose Fast Flow, Pharmacia
#17-0511-01) in 20 mM MES buffer (pH 5.5).

We then eluted the bound proteins with
20 mM Tris-HCl (pH 7.7) and 0.3 M NaCl. The elution
pool was subsequently diluted with 2 volumes of 20 mM
Tris-HCl (pH 7.7) and it was then loaded on a column
comprising either 19Thy anti-T4 monoclonal antibody
and protein A Sepharose or OKT4A and protein A
Sepharose. We washed the column extensively and
eluted the bound material as 0.5 ml fractions with
50 mM glycine-HCl (pH 2.5), 150 mM NaCl, 0.1 mM EGTA
and 5 µg/ml bovine pancreatic trypsin inhibitor,
Aprotinin (Sigma #A1153). The immunoprecipitates
were subjected to SDS-PAGE (10% gel) followed by
immunoblotting against rabbit antisera raised
against peptide JB-1. We employed silver stained
gels to follow binding and elution of rsT4 during
chromatography.

Figure 33 depicts an immunoblot analysis of
transiently expressed pBG392 (rsT4.7) [lanes 10,
11]; pBG393 (rsT4.8) [lanes 4, 7, 8] and pBG394
(rsT4.9) [lane 5]. The standards are 50 ng purified
rsT4.3 (lane 1); 150 ng purified rsT4.3 (lane 2) and
250 ng purified rsT4.3 (lane 3). The arrow indicates
the expected position of migration of a protein with
the relative molecular weight of rsT4.7: 21,000
daltons. The sample that was to be loaded into lane 4
was lost and lanes 6 and 9 are blank.

As shown in lanes 10 and 11 of Figure 33,
pBG392 (rsT4.7) directed the synthesis of a secreted,
immune protein that was recognized by the anti-T4
antibodies OKT4A and 19Thy. Lanes 4, 7 and 8 also
demonstrate that pBG393 (rsT4.8) directed the
synthesis of a secreted, immune protein that was
recognized by OKT4A and 19Thy. This analysis
illustrates that rsT4.7 contains the OKT4A epitope.
It also suggests that the binding region for HIV
envelope binding resides in the amino terminal 182
residues of T4.

In contrast, no soluble T4 could be detected
in the media of cell lines transfected with pBG394
(rsT4.9) [see lane 5]. Immunoprecipitation analysis
of cellular extracts of cell lines transfected with
pBG394, however, showed that rsT4.9 was recognized
by OKT4A. We believe that rsT4.9, a 113 amino acid
construct, binds the HIV virus and that it represents
a second generation soluble T4, one with only two
cysteines and one of three disulfide bridges.
Accordingly, rsT4.9 is easily produced in E.coli or
yeast systems.

Similarly, although no soluble T4 could be
detected in the media of cell lines transfected with
pBG396 (rsT4.12), analysis of cellular extracts of
those cell lines showed that rsT4.12 was recognized
by OKT4A. Thus, rsT4.12 may also bind HIV virus.

Radioimmunoassay And Epitope
Analysis Of rsT4.113

In order to determine if the 113 fragment
of rsT4 contained structural determinants for binding
to OKT4A, Leu-3A and OKT4, we then carried out
radioimmunoassay and epitope analysis of rsT4.113
using a competitive inhibition radioimmunoassay
[C. J. Newby et al., "Solid-Phase Radioimmune Assays"
in Handbook Of Experimental Immunology, D. M. Weir
(Ed.), 1, pp. 34.1-34.8 (1986)]. As OKT4A and Leu-3A
block infectivity of HIV in vitro [Dalgleish et al.,
supra] and binding of T4 to gp120/160 [McDougal et al.,
supra], this analysis served as a first approximation
as to whether or not rsT4.113 contained structural
elements for interaction with HIV.

We first coated U-bottom 96 well microtiter
plates (Falcon) with 50 µl/well goat-anti-mouse IgG
(Hyclone Typing Kit, Logan, Utah) in PBS (pH 7.0) to
a concentration of 50 µg/ml and incubated the plates
overnight at 4°C. We then rinsed the plates with
1X PBS and blotted them dry. The plates were then
blocked by the addition of 100 µl/well of a 1X PBS
solution containing 5% bovine serum albumin for
1 hour at room temperature. We rinsed the plates
with PBS, blotted dry and then spotted them with
50 µl of one of three antibody solutions containing
either OKT4 (10 µg/ml in block buffer); OKT4A
(500 ng/ml in block buffer) or Leu-3A (Becton-
Dickinson) (500 ng/ml in block buffer). We let the
plates stand for 2 hours at room temperature. We
then washed the plates 3 times with a PBS/0.05%
Tween-80 solution and 2 times with 1X PBS and blotted
them dry.

In a separate plate, we titrated competitor
samples of unlabeled rsT4.113.1 from 20 µg/ml and
serially diluted twice (including no competitor con-
trol), with final volumes in each well of 25 µl.
The positive control for this assay was competition
with unlabeled rsT4.3 (375 amino acids). We then
added 25 µl of 125I-rsT4.3 containing 10,000
cpm/25 µl (prepared according to A. E. Bolton and
W. M. Hunter, Radioimmunoassay And Related Methods,
Chapter 2c). Subsequently, we spotted the entire
50 µl content of each well onto the assay plate con-
taining each of the three antibody solutions and
incubated for 2 h at room temperature. We then
washed the plates 3 times with a PBS/0.5% Tween-80
solution and 2 times with 1X PBS, blotted them dry
and then counted the wells in a Beckman gamma counter
for radioactivity.
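A competitor titration of this kind can be laid out as a serial dilution series. The sketch below is an illustration under assumptions: it takes the dilution factor to be two-fold and the number of steps to be six, neither of which is stated explicitly above, and it includes a zero-competitor control. It also shows the halving of each competitor concentration when 25 µl of tracer is added to 25 µl of competitor.

```python
def serial_dilution(start, factor, steps):
    """Concentrations of a serial dilution, beginning at `start` and
    dividing by `factor` at each subsequent step."""
    return [start / factor ** i for i in range(steps)]


def after_mixing(conc, sample_ul=25.0, added_ul=25.0):
    """Concentration after a sample is mixed with an added volume
    that contains none of the competitor."""
    return conc * sample_ul / (sample_ul + added_ul)


# Assumed layout: two-fold steps from 20 ug/ml, plus a no-competitor well.
competitor_ug_per_ml = serial_dilution(20.0, 2.0, 6) + [0.0]
final_ug_per_ml = [after_mixing(c) for c in competitor_ug_per_ml]

print(competitor_ug_per_ml)
print(final_ug_per_ml)
```

Plotting bound counts against the final competitor concentration for each antibody would give the dose-response curves used to compare rsT4.113.1 with the unlabeled rsT4.3 control.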

As shown in Figure 34, rsT4.113.1 competes
with 125I-rsT4.3 for absorption to an OKT4A solid
phase in a dose-dependent manner. Additionally,
rsT4.113.1 competes with 125I-rsT4.3 for absorption
to a Leu-3A solid phase in a dose-dependent manner.
By comparison to unlabeled rsT4.3, rsT4.113.1 exhibits
a molar affinity for those antibodies within a factor
of 3. In the 0.4 to 25 µg/ml concentration range
tested, rsT4.113 did not compete with radiolabelled
rsT4.3 for binding to OKT4. In a similar assay, we
observed that rsT4.111 also competes with 125I-rsT4.3
for binding to OKT4A and Leu-3A, but not to OKT4
[Figures 35-37].

Based on these results, we believe that
the epitopes for OKT4A and Leu-3A are contained within
the amino-terminal 113 amino acids of T4. We also
believe that the epitope for OKT4 binding is localized
within the carboxy terminal of the T4 polypeptide.

Accordingly, we believe that the gp120-
binding domain is localized within the amino terminal
113 or 111 amino acids of the T4 protein. Based on
this belief, we synthesized various synthetic oligo-
peptides which contain sequence within that structural
domain. These oligopeptides are represented in
Figure 3 as follows:

Oligopeptide        Amino Acid Coordinates

J*' 1               44-63
rsT4 #6             18-29
rsT4 #7             5-56
rsT4 #8             84-9?
rsT4 #9             30-63

We synthesized these peptides using conventional
phosphoramidite DNA synthesis techniques [Tetrahedron
Letters, 22, pp. 1859-62 (1981)]. We synthesized
the peptides on an Applied Biosystems 380A DNA
Synthesizer and purified them by gel electrophoresis.

ELISA Assay For rsT4.113

We also carried out an ELISA assay for
rsT4.113.1 produced by p211-11-transformed E. coli.
Throughout this assay, dilutions were made in block-
ing solution and, between each step, we washed the
plates with PBS/0.05% Tween-20. More specifically,
we coated wells of Immulon 2 (Dynatech, Chantilly,
Virginia) plates with 0.005 OD (280 nm)/ml of OKT4
(IgG2b) in 0.05 M bicarbonate buffer to a volume of
50 µl/well and incubated the plates overnight at
4°C. We then blocked the plates with 5% bovine
serum albumin in PBS, 200 µl/well, and incubated for
30 minutes at room temperature.

Subsequently, we added 50 µl of 50 ng/ml
rsT4.3 to each well, incubating overnight at 4°C.
We then added 50 µl/well of a mixture containing
rsT4.113.1 and 10 ng/ml of OKT4A and incubated for
2 1/2 hours at room temperature. Using a Hyclone
Kit (Hyclone), we then carried out the following
steps. First, we added 1 drop of rabbit anti-mouse
IgG2a to each well and incubated the plates for
1 hour at room temperature. We then added 100 µl of
peroxidase-labeled anti-rabbit IgG, diluted 1:4000
with blocking buffer, to each well and incubated for
1 hour at room temperature.

We prepared a substrate reagent as follows.
We diluted substrate reagent 1:10 in distilled water
and added two O-phenyl-ethylene-diamine ("OPD")
chromophore tablets per 10 ml of substrate. We let
the mixture dissolve thoroughly by mixing with a
vortex. Alternatively, a TMB peroxidase substrate
system (Kirkegaard & Perry Catalogue #50-76-00) may
be used. Subsequently, we added 100 µl of the
chromophore solution to each well, incubated for
10-15 minutes at room temperature and then stopped
the color development with 100 µl of 1N H₂SO₄. We
then measured OD at 490 nm, using an ELISA plate
reader.

The results of the assay are demonstrated
in Figure 38.

We then subjected the soluble T4 proteins
produced by the T4 constructs of this invention to
various functional assays.

Assays Of The Antiviral Activity Of Soluble T4

The antiviral activity of soluble T4
according to this invention was evaluated using
modifications of various in vitro systems used to
study antiviral agents and neutralizing antibodies
[D. D. Ho et al., "Recombinant Human Interferon Alpha-A
Suppresses HTLV-III Replication In Vitro",
Lancet, pp. 602-04 (1985); K. Hartshorn et al.,
"Synergistic Inhibition Of HTLV-III Replication
In Vitro By Phosphonoformate And Recombinant Inter-
feron Alpha-A", Antimicrob. Agents Chemother., 30,
pp. 189-91 (1986)].

For each of these assays, we prepared graded
concentrations of soluble T4 and preincubated them
with an H9 derived IIIB isolate of HIV [a gift from
Drs. M. Popovic and R. Gallo, National Cancer
Institute, Bethesda, Maryland]. The isolate was
maintained as a chronically infected culture in H9
cells. Cell-free HIV stocks were obtained from
supernatant fluids of HTLV-III infected H9 cultures
(culture conditions: 1 x 10⁶ cells/ml with 75% viable
cells). We prepared serial 10-fold dilutions of
recombinant soluble T4 ranging from 10 picograms/ml
to 10 micrograms/ml and incubated them with fifty
50% tissue culture infectious doses (TCID₅₀) of HIV
for 1 hour at 37°C, in RPMI-1640 supplemented with
20% heat inactivated fetal calf serum (FCS). We
then added 150 µl of H9 cells which were not
HIV-infected, to a final concentration of
0.5 x 10⁶ cells/ml, to the wells containing aliquots
of the recombinant soluble T4/HIV mixture.
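
The serial 10-fold dilution range described above can be enumerated with a short sketch. This is only an illustration: the function name `tenfold_series` is hypothetical, and the only inputs taken from the text are the two endpoints (10 picograms/ml and 10 micrograms/ml).

```python
import math


def tenfold_series(start_ug_per_ml, stop_ug_per_ml):
    """Return concentrations from start to stop (inclusive) in 10-fold steps."""
    # Number of 10-fold steps separating the two endpoints.
    steps = round(math.log10(stop_ug_per_ml / start_ug_per_ml))
    return [start_ug_per_ml * 10 ** i for i in range(steps + 1)]


# 10 pg/ml is 1e-5 ug/ml; the top of the range is 10 ug/ml.
dilutions = tenfold_series(1e-5, 10.0)
for conc in dilutions:
    print(f"{conc:.5f} ug/ml")
```

Under these assumptions the range spans seven concentrations, one per decade.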

We adjusted each virus inoculum to a con-
centration of 250 TCID₅₀/ml. We preincubated 100 µl
of the virus inoculum with 200 µl recombinant solu-
ble T4 or 100 µl immunoglobulin prepared in tripli-
cate serial 2-fold dilutions for 1 hour at 37°C
prior to inoculation onto 1.5 - 2 x 10⁶ H9 cells in
5 ml RPMI 1640 supplemented with fetal calf serum
(20%), HEPES (10 mM), penicillin (250 U/ml),
streptomycin (250 µg/ml) and L-glutamine (2 mM). On
days 5, 6, 7, 10 and 14, we examined each culture for
characteristic cytopathic effects ("CPE").
Neutralization was defined as the inhibition of
syncytia formation compared with controls.

The positive control used was HIV seroposi-
tive neutralizing serum, as described in D. D. Ho
et al., "Human Immunodeficiency Virus Neutralizing
Antibodies Recognize Several Conserved Domains On
The Envelope Glycoproteins", J. Virol., 61,
pp. 2024-28 (1987). The negative controls used
were HIV seronegative serum only and buffer only.

Cytopathic Effect Assay (CPE)

In this assay, following conventional
protocols for cytopathic effect assays [Klatzmann
et al. (1984), supra and Wong-Staal and Gallo (1985),
supra], we microscopically examined the H9 cells for
evidence of cytopathic effects of HIV.

The CPE was scored on a four point scale
from 1+ to 4+, with 4+ representing the highest
degree of CPE.

By day 14, wells containing recombinant
soluble T4 according to this invention (rsT4.2,
derived from the pBG380 transfected CHO cell line
BG380) at 10 µg/ml showed no evidence of CPE, while
the negative control showed 1+ to 3+ CPE.

p24 Radioimmunoassay

We then tested soluble T4 as an inhibitor
of viral replication in an HIV virus replication
assay according to D. D. Ho et al., J. Virol., 61,
pp. 2024-28 (1987) and J. Sodroski et al., Nature,
322, pp. 470-74 (1986). We carried out the assay
essentially as described, except that the cultures
were propagated in microtiter wells containing
200 µl. In this assay, we evaluated the ability of
the soluble T4 polypeptides of this invention to
block HIV replication, as measured by HIV p24
antigen production. We sampled supernatants twice
weekly for HIV p24 antigen as described below.

We obtained an assay kit [HTLV-III p24
Radioimmunoassay System, Catalogue No. NEK-040,
NEK-040A, Biotechnology Systems, NEN Research
Products, Dupont] which contains affinity purified
¹²⁵I labelled HIV p24 antigen, a rabbit anti-p24
antibody and a second goat anti-rabbit antibody which
is used to precipitate antigen-antibody complexes.
We carried out the assay according to the protocol
included with the kit. Accordingly, we mixed a sample
to be assayed or one of a series of amounts of
unlabelled p24 antigen with a fixed amount of ¹²⁵I
labelled p24 and a fixed limited amount of rabbit
anti-p24 antibody. We incubated the samples over-
night at room temperature and then added a goat
anti-rabbit immunoglobulin preparation for 5 minutes
at 40°C. We centrifuged the samples in a microfuge
and aspirated the supernatant fluid. Pelletted ¹²⁵I
labelled p24 was quantitated for each sample by gamma
counting and a standard curve for the ¹²⁵I p24 dis-
placed by the known amounts of antigen added to
standard tubes was constructed. We then calculated
the ¹²⁵I labelled p24 displaced by the antigen present
in the unknown samples by interpolation using the
standard curve constructed from the known amounts of
p24 antigen contained in the standard samples. The
results are shown in the table below.
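
The interpolation step described above can be sketched as follows. This is a minimal illustration that assumes straight-line interpolation between neighbouring standards; the kit protocol may prescribe a different curve fit. The function name `interpolate_p24` and all standard values are hypothetical, not data from this assay.

```python
def interpolate_p24(cpm, standards):
    """Estimate unlabelled p24 (ng/ml) in a sample from its pelleted counts.

    standards: list of (p24_ng_per_ml, cpm) pairs from the standard tubes.
    In a competition assay, pelleted counts fall as unlabelled p24 rises.
    """
    pts = sorted(standards, key=lambda s: s[1])  # ascending by cpm
    if not pts[0][1] <= cpm <= pts[-1][1]:
        raise ValueError("cpm lies outside the standard curve")
    for (conc_a, cpm_a), (conc_b, cpm_b) in zip(pts, pts[1:]):
        if cpm_a <= cpm <= cpm_b:
            # Linear interpolation between the two bracketing standards.
            frac = (cpm - cpm_a) / (cpm_b - cpm_a)
            return conc_a + frac * (conc_b - conc_a)


# Hypothetical standards, for illustration only.
standards = [(0.0, 2400), (1.0, 1800), (5.0, 900), (25.0, 300)]
estimate = interpolate_p24(1350, standards)
print(f"{estimate:.2f} ng/ml")
```

Samples whose counts fall outside the standards are rejected rather than extrapolated, which is a design choice of this sketch.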

                  HIV REPLICATION INHIBITION

               rsT4.2     Patient     Average    % Bound/
Day            (µg/ml)    Serum       CPM        Unbound

[illegible]               Negative      344         6.5
                          Positive    2,237       112.4
               0.5*                     551        19.9
               5.0**                  1,766        86.6

10                        Negative      230         2.2
                          Positive    2,459       124.6
               0.5*                     322         7.3
               5.0**                  1,980        96.3

14                        Negative      221         1.8
                          Positive    2,284       115.0
               0.5*                     246         3.1
               5.0**                  1,988        98.7

These results demonstrate that soluble T4
according to this invention at a concentration of
5 µg/ml completely inhibits virus replication as
measured in this standard 14 day assay. These
results are also depicted in Figure 39 in graphic
form. In Figure 39, values were calculated from a
standard curve of p24 according to assay kit
instructions.



* This concentration was initially believed to be
1.0 µg/ml, based upon our preliminary approximation
that 1 unit of absorbance at 280 nm ("A₂₈₀") was
equivalent to 1 mg of rsT4.2. Absorbance at 280 nm
is a commonly used first approximation of protein
concentration. Upon amino acid analysis of the pro-
tein, however, we found that it had a higher extinc-
tion coefficient than originally approximated, with
1 A₂₈₀ unit of rsT4.2 being equivalent to 0.5 mg of
the protein.

** This concentration was initially believed to
be 10 µg/ml, based upon our preliminary approximation
that 1 unit of absorbance at 280 nm ("A₂₈₀") was
equivalent to 1 mg of rsT4.2. Absorbance at 280 nm
is a commonly used first approximation of protein
concentration. Upon amino acid analysis of the pro-
tein, however, we found that it had a higher extinc-
tion coefficient than originally approximated, with
1 A₂₈₀ unit of rsT4.2 being equivalent to 0.5 mg of
the protein.
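
The correction described in these footnotes is a simple rescaling, sketched here for illustration. The constant and function names are hypothetical; the two conversion factors (1 mg and 0.5 mg of rsT4.2 per A₂₈₀ unit) are the ones stated above.

```python
MG_PER_A280_ASSUMED = 1.0   # preliminary approximation
MG_PER_A280_MEASURED = 0.5  # value found by amino acid analysis


def corrected_concentration(nominal_ug_per_ml):
    """Rescale a concentration that was estimated with the assumed factor."""
    return nominal_ug_per_ml * MG_PER_A280_MEASURED / MG_PER_A280_ASSUMED


for nominal in (1.0, 2.5, 10.0, 25.0):
    print(f"{nominal} ug/ml nominal -> {corrected_concentration(nominal)} ug/ml")
```

Applied to the nominal values of 1, 2.5, 10 and 25 µg/ml, this gives the corrected concentrations of 0.5, 1.25, 5.0 and 12.5 µg/ml reported in this specification.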

We then carried out a p24 replication assay
as described above, except that the soluble T4 was
added to the infected cultures during refeeding at
days 3, 7 and 10, in order to maintain a constant
rsT4 concentration throughout the infection period.
The results of this assay are shown in the table
below.

        INHIBITION OF HIV REPLICATION
     WITH CONSTANT CONCENTRATION OF rsT4

     rsT4.2            p24
     (µg/ml)           (ng/ml)

     0.008               770
     0.031               970
     0.125                85
     0.5                   0
     5.0                   0
     0                  1120
     uninfected            0


These results demonstrate that when solu-
ble T4 protein according to this invention was main-
tained at a constant concentration throughout the
infection period, as little as 0.125 µg/ml of the
protein substantially blocked replication of 250
TCID₅₀/ml of HIV-1.

Advantageously, soluble T4 protein accord-
ing to this invention, at concentrations far exceed-
ing those required to block viral replication, did
not exert immunotoxic effects in vitro, as measured
by three lymphocyte proliferation assays: mixed
lymphocyte response, phytohemagglutinin, and tetanus
toxoid stimulated response.

Syncytia Inhibition Assay 

To further assess the effect of soluble T4
on HIV env-T4 binding, we evaluated the effect of two
preparations of our soluble T4 protein on the syn-
cytiagenic properties of HIV in the co-cultivation
assay. We carried out a C8166 cell fusion assay
as described in B. D. Walker et al., Proc. Natl.
Acad. Sci. USA, 84, pp. 8120-24 (1987).

We incubated 1 x 10⁴ H9 cells chronically
infected with HTLV-IIIB for 1 hour at 37°C in 5%
CO₂ with various concentrations of one of two
preparations of rsT4.2 in 150 µl RPMI-1640 media
supplemented with 20% fetal calf serum. We then
added 3 x 10⁴ C8166 cells (a T4⁺ transformed human
umbilical cord blood lymphocyte line [Sodroski
et al., supra]) in 50 µl media, to a final volume of
0.2 ml in each well. Final well concentrations of
soluble T4 were 0.5 µg/ml* and 5.0 µg/ml* for prepa-
ration #1 and 1.25 µg/ml* and 12.5 µg/ml* for prepa-
ration #2. We then counted the total number of
syncytia per well at 2 hours and 4 hours after adding
the C8166 cells at 37°C in 5% CO₂. Parallel
co-cultivations used buffer alone (negative control)
or OKT4A at 25 µg/ml (positive control) as controls.
We considered a positive result as a 50% reduction in
syncytia compared to controls, at a time when at
least 100 syncytia per 10⁴ infected H9 cells were
present in the control cultivations. The results of
this assay are shown below and in Figure 40 (2 hour
data).



* These concentrations were initially believed
to be, respectively, 1 µg/ml, 10 µg/ml, 2.5 µg/ml
and 25 µg/ml, based upon our preliminary approxima-
tion that 1 unit of absorbance at 280 nm ("A₂₈₀")
was equivalent to 1 mg of rsT4.2. Upon amino acid
analysis of the protein, however, we found that it
had a higher extinction coefficient than originally
approximated, with 1 A₂₈₀ unit of rsT4.2 being
equivalent to 0.5 mg of the protein.




          INHIBITION IN C8166 FUSION ASSAY

                      rsT4.2        % Inhibition*
Preparation           (µg/ml)       2 Hrs     4 Hrs

buffer                 0              0         0
rsT4.2                 0.5**         30        42
rsT4.2                 5.0**         54        47
rsT4.2                 1.25**        16        21
rsT4.2                12.5**         77        55
OKT4A (25 µg/ml)       0            100       100



As demonstrated in this table and in Fig-
ure 40, soluble T4 according to this invention at
5.0 µg/ml and 12.5 µg/ml inhibited syncytia formation
at 2 hours, as compared to buffer alone. By 4 hours
after the addition of C8166 cells, soluble T4 at
12.5 µg/ml continued to inhibit syncytia formation
by greater than 50%, as compared to the negative
control.

We also evaluated the effect of two prep- 
arations of our soluble T4 protein rsT4.7 on the 
syncytiagenic properties of HIV in a similar co- 
cultivation assay. The results of this assay are 
shown below. 



* All assays were carried out in triplicate, and
the number of syncytia counted per well was averaged
to calculate % inhibition. The % inhibition repre-
sents the difference between the average number of
syncytia in the negative control (without rsT4 or
OKT4A) and the average number of syncytia counted
when either rsT4 or OKT4A were present during the
assay, divided by the average syncytia count for
the negative control and multiplied by 100.

** These concentrations were initially believed
to be, respectively, 1 µg/ml, 10 µg/ml, 2.5 µg/ml
and 25 µg/ml, based upon our preliminary approxima-
tion that 1 unit of absorbance at 280 nm ("A₂₈₀")
was equivalent to 1 mg of rsT4.2. Upon amino acid
analysis of the protein, however, we found that it
had a higher extinction coefficient than originally
approximated, with 1 A₂₈₀ unit of rsT4.2 being
equivalent to 0.5 mg of the protein.
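
The % inhibition calculation defined in the footnote above can be written out as a short sketch. The function name is hypothetical. The example counts (118 and 141 syncytia in the infected controls, 43 and 27 with rsT4.7) are the averages reported in the rsT4.7 tables of this section.

```python
def percent_inhibition(control_mean, treated_mean):
    """(control - treated) / control x 100, using average syncytia counts."""
    if control_mean <= 0:
        raise ValueError("the negative control must contain syncytia")
    return (control_mean - treated_mean) / control_mean * 100.0


# Average syncytia per aliquot: negative control versus rsT4.7-treated wells.
print(round(percent_inhibition(118, 43), 1))
print(round(percent_inhibition(141, 27), 1))
```

Rounded to one decimal place, these reproduce the tabulated values of 63.6 and 80.9.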

               INHIBITION IN C8166 FUSION ASSAY
                     Assay date: day 1

                                  Average
                     rsT4.7       syncytia/50 µl   % Inhibition
Preparation          (µg/ml)      aliquot          at 2 hrs

H9 cells             0              0              N/A
(control)

C8166 cells          0              0              N/A
(control)

HIV-infected         0            118              0
H9 cells
added to
C8166 cells
(control)

OKT4A                                              100
(control)

Prep. 1 of           S 5.0*        43              63.6
rsT4.7

* This concentration was initially believed to
be 10 µg/ml, based upon our preliminary approxima-
tion that 1 unit of absorbance at 280 nm ("A₂₈₀")
was equivalent to 1 mg of rsT4.2. Upon amino acid
analysis of the protein, however, we found that it
had a higher extinction coefficient than originally
approximated, with 1 A₂₈₀ unit of rsT4.2 being
equivalent to 0.5 mg of the protein.

                     Assay date: day 13

                                  Average
                     rsT4.7       syncytia/50 µl   % Inhibition
Preparation          (µg/ml)      aliquot          at 2 Hrs

H9 cells             0              0              N/A
(control)

C8166 cells          0              1              N/A
(control)

HIV-infected         0            141              0
H9 cells added
to C8166 cells
(control)

OKT4A (control)      0              0              100

Prep. 2 of           S 5.0*        27              80.9
rsT4.7

* This concentration was initially believed to
be 10 µg/ml, based upon our preliminary approxima-
tion that 1 unit of absorbance at 280 nm ("A₂₈₀")
was equivalent to 1 mg of rsT4.2. Upon amino acid
analysis of the protein, however, we found that it
had a higher extinction coefficient than originally
approximated, with 1 A₂₈₀ unit of rsT4.2 being
equivalent to 0.5 mg of the protein.


                                  Average
                     rsT4.7       syncytia/50 µl   % Inhibition
Preparation          (µg/ml)      aliquot          at 2 Hrs

H9 cells             0              0              N/A
(control)

C8166 cells          0              0              N/A
(control)

HIV-infected         0            128
H9 cells added
to C8166 cells
(control)

OKT4A (control)      0              0              100

Prep. 1 of           3 5.0*        35              72.7
rsT4.7

Prep. 2 of           S 5.0*         2              98.4
rsT4.7




As demonstrated in these tables, soluble 
T4 protein rsT4.7 inhibited syncytia formation in 
HIV-infected H9 cells. 

We also evaluated the effect of rsT4.113.1
and rsT4.111 on the syncytiagenic properties of HIV in
a co-cultivation assay. We carried out a C8166 cell
fusion assay as described in Walker et al., supra.
We incubated 1 x 10⁴ H9 cells chronically
infected with HTLV-IIIB for 1 hour at 37°C in 5%
CO₂ with from 5 to 50 µg/ml rsT4.113.1 or rsT4.111
in 150 µl RPMI-1640 media supplemented with 20%
fetal calf serum in 96-well microtiter plates. We
then added 3 x 10⁴ C8166 cells to the wells in 50 µl
aliquots. The plates were incubated for 2 hours at
37°C in 5% CO₂ and, following this incubation, the
number of syncytia per well was counted.

Syncytia were defined as cells containing
a ballooning cytoplasm greater than three cell
diameters. All samples were counted twice. Parallel
co-cultivations used OKT4A alone or rsT4.3 alone at a
concentration of 25 µg/ml (positive controls) or H9
cells alone or C8166 cells alone (negative controls).
The results of this assay are shown below and in
Figure 41.



          INHIBITION IN C8166 FUSION ASSAY

Preparation              rsT4 (µg/ml)    % Inhibition

H9 cells (control)        0                 0
C8166 cells (control)     0                 0
rsT4.113.1                1.25             35
rsT4.113.1                2.5              63
rsT4.113.1                4.25             63
rsT4.113.1                6.25             82
rsT4.113.1               12.5              96
rsT4.3                   12.5             100
OKT4A (25 µg/ml)          0               100



As demonstrated in this table and in
Figure 41, rsT4.113.1 exhibited a dose-dependent
inhibition of HIV-induced syncytia formation. The
molar specific inhibitory activity of rsT4.113.1
appeared to be reduced by an order of magnitude by
comparison to the anti-viral activity of longer forms
of recombinant soluble T4. Thus, whereas rsT4.113.1 is
effective toward neutralization of HIV-dependent
cell fusion in vitro, its molar specific inhibitory
activity is decreased by a factor of 10. It is
undetermined whether this decreased potency is due
to incomplete renaturation of the E. coli-derived
protein, the presence of three additional amino
acids at the N-terminus of rsT4.113.1 (Met-Gln-Gly)
lacking in rsT4.2 or rsT4.3 produced in mammalian
cells, or the absence of additional structure in
rsT4.113.1 required for high-affinity binding to
HIV.

We also carried out a C8166 cell fusion
assay with rsT4.111, as described for rsT4.113.1.
The results of this assay are shown below.

          INHIBITION IN C8166 FUSION ASSAY

Preparation              rsT4 (µg/ml)    % Inhibition

H9 cell (control)         0                 0
C8166 cells (control)     0                 0
rsT4.111                  1.25              0
rsT4.111                  2.5              40
rsT4.111                  4.25             20
rsT4.111                  6.25             67
rsT4.111                 12.5             100
rsT4.111                 25.0             100
rsT4.3                   12.5             100
rsT4.3                   25.0             100
OKT4A (25 µg/ml)          0               100

As demonstrated in this table, rsT4.111
exhibited a dose-dependent inhibition of HIV-induced
syncytia formation. At a concentration of 12.5 µg/ml
and 25.0 µg/ml, complete inhibition of cell fusion
was achieved.

Kinetics Of Intramuscular Injection Of Soluble T4

We examined the kinetics of the appearance
of a recombinant soluble T4 protein according to
this invention (specifically, rsT4.3 from the pBG381-
transfected cell line BG381) in serum after intra-
muscular injection as follows.

We obtained two cynomolgus monkeys (Macaca
fascicularis) that were free of infectious disease
and in good health. Each monkey had been subjected
to a 6 week quarantine period prior to administration
of the soluble T4 protein. Throughout the adminis-
tration period, each monkey was maintained on a con-
ventional diet of monkey chow supplemented with fresh
fruit. A catheter and a vascular access port were
surgically placed in a femoral vein of each animal
prior to treatment in order to facilitate blood
collection.

Over a period of 28 days, each animal
received recombinant soluble T4 protein twice daily
by intramuscular injection to the large muscles of
the thighs or buttocks. Injections were administered
to each animal 8 hours apart and each injection con-
tained a volume of 0.15 ml/kg (0.25 mg/kg) of rsT4.3
(from the pBG381-transfected cell line BG381), for a
total dose of 0.5 mg/kg/day/monkey. Serum samples
for clearance determination were collected on day 1
before the first treatment and at 1, 2, 4 and
8 hours after the first injection, as well as 1, 2,
4, 14 and 16 hours after the second injection on
days 7, 14 and 28.
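
The dosing figures above are internally consistent, as the following arithmetic sketch shows. The stock concentration is derived here from the stated dose and volume and is not quoted in the text; the 4 kg body weight in the usage line is purely hypothetical.

```python
DOSE_PER_INJECTION_MG_PER_KG = 0.25
VOLUME_PER_INJECTION_ML_PER_KG = 0.15
INJECTIONS_PER_DAY = 2


def daily_dose_mg_per_kg():
    """Total rsT4.3 delivered per kg of body weight per day."""
    return DOSE_PER_INJECTION_MG_PER_KG * INJECTIONS_PER_DAY


def implied_concentration_mg_per_ml():
    """Concentration of the injected solution implied by dose and volume."""
    return DOSE_PER_INJECTION_MG_PER_KG / VOLUME_PER_INJECTION_ML_PER_KG


def injection_for_animal(weight_kg):
    """Return (mg of rsT4.3, ml of solution) for a single injection."""
    return (weight_kg * DOSE_PER_INJECTION_MG_PER_KG,
            weight_kg * VOLUME_PER_INJECTION_ML_PER_KG)


print(daily_dose_mg_per_kg())
print(round(implied_concentration_mg_per_ml(), 2))
print(injection_for_animal(4.0))  # hypothetical 4 kg animal
```

Two injections of 0.25 mg/kg give the stated 0.5 mg/kg/day, and the dose-to-volume ratio implies a solution of roughly 1.67 mg/ml.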

We found that intramuscularly injected
soluble T4 reached the maximum level in serum between
1 and 2 hours after injection, with the level falling
off slowly and reaching half-maximum value at approxi-
mately 6 hours post-injection. According to data
obtained for intravenous administration (not shown),
the level of rsT4.3 in serum should drop below that
attained via intramuscular injection approximately
2 hours after intravenous injection. Thus, while
the maximum rsT4.3 level in serum after intramuscular
injection does not reach that attainable via intra-
venous injection, it is slowly released into the
blood stream, remaining detectable in serum for a
much longer time. This slow release mechanism asso-
ciated with intramuscular routes of injection is
advantageous because the soluble T4 protein remains
above a given concentration for a longer period of
time and is thus maintained at a sustained level.
Intramuscular administration of soluble T4 protein
is particularly useful in treating early stage
HIV-infected patients, to prevent the virus from
disseminating, or in treating patients who have been
exposed to the virus and who are not yet
seropositive.

We determined serum levels of rsT4.3 using
an ELISA assay. Throughout this assay, dilutions
were made in blocking solution and, between each
step, we washed the plates with PBS/0.05% Tween-20.
More specifically, we coated wells of Immulon 2
plates with 0.01 OD (280 nm)/ml of OKT4 (IgG2b) in
0.05 M bicarbonate buffer to a volume of 50 µl/well
and incubated the plates overnight at 4°C. We then
blocked the plates with 5% bovine serum albumin in
PBS, 200 µl/well, and incubated for 30 minutes at
room temperature.

Subsequently, we added 50 µl of sample or
standard to each well, incubating for 4 hours at
room temperature. We then added 50 µl/well of OKT4A
at 0.1 µg/ml and incubated overnight at 4°C. Using
a Hyclone Kit (Hyclone) we then carried out the fol-
lowing steps. First, we added 1 drop of rabbit
anti-mouse IgG2a to each well and incubated the
plates for 1 hour at room temperature. We then added
100 µl of peroxidase-labeled anti-rabbit IgG, diluted
1:4000 with 5% BSA/PBS, to each well and incubated
for 1 hour at room temperature.

We prepared a substrate reagent as follows.
We diluted substrate reagent 1:10 in distilled water
and added two O-phenyl-ethylene-diamine ("OPD")
chromophore tablets per 10 ml of substrate. We let
the mixture dissolve thoroughly by mixing with a
vortex. Alternatively, a TMB peroxidase substrate
system (Kirkegaard & Perry Catalogue #50-76-00) may
be used. Subsequently, we added 100 µl of the
chromophore solution to each well, incubated for
10-15 minutes at room temperature and then stopped
the color development with 100 µl of 1N H₂SO₄. We
then measured OD at 490 nm, using an ELISA plate
reader.

The results of the assay are demonstrated
in the tables below.

Monkey #7-91

                      rsT4 Level (ng/ml)
Time (hr)    Day 1      Day 7      Day 14     Day 28

 0            22.7*      96.5      158.0       19.6
 1           278.8      199.6      360.7      238.3
 2           281.8      366.8      306.4      441.1
 4           214.9      246.6      363.9      393.2
 5                                            290.4
 8            72.3      105.0      199.4
 9**         246.2
10           259.6
12           136.0
22            23.8
24            13.4

Monkey #7-92

                      rsT4 Level (ng/ml)
Time (hr)    Day 1      Day 7      Day 14     Day 28

 0             6.7*      56.0      106.3       60.9
 1            87.2      225.8      178.0      437.7
 2           254.2      377.9      253.2      770.6
 4           170.0      167.3      308.2      821.5
 5                                            898.3
 8           118.9      101.2      176.5
 9**         405.1
10           523.5
12           371.5
22            48.4
24            39.4

*  background

** second injection administered after the
   collection of the 8 hour sample.



Polyvalent Forms Of Recombinant Soluble T4

Receptors may be characterized by their
affinity for specific ligands, such that, at equili-
brium, the intrinsic affinity (K_a) between monovalent
receptor and monovalent ligand can be defined as
[RL]/[R_f][L_f], where [RL] is the concentration of
receptor (R) bound to ligand (L) and [R_f] and [L_f]
are the concentrations of free receptor and ligand,
respectively [P. A. Underwood, in Advances In Virus
Research, ed. K. Maramorosch et al., 34, pp. 283-309
(1988)].
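
The definition of intrinsic affinity given above translates directly into a small calculation. This sketch uses hypothetical equilibrium concentrations, in molar units, and also reports the dissociation constant as the reciprocal of K_a, which is the standard relationship for a one-to-one binding equilibrium. The function names are illustrative.

```python
def intrinsic_affinity(bound_m, free_receptor_m, free_ligand_m):
    """K_a = [RL] / ([R_f][L_f]); concentrations in mol/l, result in l/mol."""
    if free_receptor_m <= 0 or free_ligand_m <= 0:
        raise ValueError("free concentrations must be positive")
    return bound_m / (free_receptor_m * free_ligand_m)


def dissociation_constant(k_a):
    """K_d is the reciprocal of K_a for a one-to-one equilibrium."""
    return 1.0 / k_a


# Hypothetical equilibrium concentrations, for illustration only.
k_a = intrinsic_affinity(bound_m=2e-9, free_receptor_m=1e-9, free_ligand_m=5e-9)
k_d = dissociation_constant(k_a)
print(f"K_a = {k_a:.2e} l/mol, K_d = {k_d:.2e} mol/l")
```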

For a polyvalent receptor (with a valency
of n) binding to a polyvalent ligand (with a valency
of m), a functional affinity can be defined as
n[R_b]/n[R_f]m[L_f], where [R_b] is the concentration of
bound receptor sites, and n[R_f] and m[L_f] are, respec-
tively, the concentrations of free receptor and
ligand binding sites. The effect of increasing the
valence (the number of binding sites) is to enhance
the stability of ligand-receptor complexes. The
affinity of a polyvalent receptor for a polyvalent
ligand will depend on three factors: the intrinsic
association constant of each binding site, the
valency (number of binding sites) and the topo-
logical relationship between the receptor and ligand
binding sites. Under some circumstances, polyvalent
binding interactions will lead to higher functional
affinity. The decreased dissociation rate of poly-
valent ligands with polyvalent receptors results in
an increased functional affinity [C. L. Hornick and
F. Karush, Immunochemistry, 9, pp. 325-40 (1972);
I. Otterness and F. Karush, "Principles Of Antibody
Reactions", in Antibody As A Tool, ed. J. J.
Marchalonis and G. W. Warr, pp. 97-137 (1982)].

The simplest case for receptor polyvalency
increasing functional affinity is represented by a
bivalent soluble receptor, such as an antibody
molecule, which has two identical ligand binding
sites, each capable of independently binding antigen
with equal affinity. If the antigen is displayed
polyvalently, for example, chemically coupled to a
solid support such that the spacing between antigenic
sites can be bridged by the antibody's two antigen
binding arms, the functional affinity of the antibody
for the antigen coupled to the solid support would be
greater than the intrinsic affinity of the
binding site for the monovalent antigen [D. M. Crothers
and H. Metzger, Immunochemistry, 9, pp. 341-57
(1972)]. Because virus particles represent poly-
valent antigens, the greater functional affinity of
antibodies for polyvalent antigens is an important
factor for antibody-directed virus neutralization.
The association of recombinant soluble T4
and the HIV major envelope glycoprotein gp120 is an
example of monovalent receptor binding to monovalent
ligand. The affinity of this interaction has been
measured, and has a dissociation constant
K_d = 4 x 10⁻⁹ M [Lasky et al., Cell, 50,
pp. 975-85 (1987)].

Using the antibody analogy, we believe
that polyvalent rsT4 will demonstrate a greater
affinity for HIV-infected cells displaying gp120
than monovalent rsT4 and the topological relation-
ship between gp120 on the virus particle or the
infected cell surface will determine the degree to
which polyvalent rsT4 exhibits higher functional
affinity than monovalent rsT4. One example of a
polyvalent rsT4 is described below, with respect to
the production of a recombinant bivalent rsT4 con-
sisting of two tandem repeats of amino acids 3-176,
followed by the C-terminal 199 amino acids of rsT4.3.

As used in this invention, a "polyvalent" receptor
possesses two or more binding sites for a given
ligand. Furthermore, the intrinsic affinity of each
ligand binding site of a given polyvalent receptor
need not be identical.

As shown in Figure 42, we constructed bivalent
rsT4 as follows. We digested pBG391 with NheI, which
cleaves [text illegible in source] rsT4, and removed
the overhang with mung bean nuclease. We then
cleaved with BglII to remove the C-terminal portion
of the rsT4 coding sequence in pBG391. Finally, we
ligated a DraI-BglII fragment containing the coding
sequence for rsT4 amino acids 3 (lysine) through 176
(isoleucine) to the cleaved pBG391 to create pBiv.1,
a plasmid coding for a fusion protein with a tandem
duplication of the N-terminal 176 amino acids of
rsT4, followed by the C-terminal 199 amino acids of
rsT4.3. The protein produced by this plasmid,
therefore, contains two adjacent N-terminal gp120-
binding or OKT4A-binding domains (defined by amino
acid residues 3 through 111 of rsT4.111), followed
by one OKT4-binding C-terminal domain (Figure 43).
pBiv.1 was transfected by electroporation
into COS 7 cells to test expression of the bivalent
rsT4 protein. Three days later, we tested the con-
ditioned medium of the transfected cells for the
presence of the rsT4 bivalent protein by immuno-
precipitation, followed by Western blot analysis of
the precipitated protein. Both OKT4A and OKT4 were
used for immuno-precipitation to determine that the
OKT4 epitope and at least one of the OKT4A epitopes
had folded correctly. Both antibodies precipitated
a protein of predicted apparent molecular weight
(60,000 d) from the conditioned medium of the cells.

Bivalent rsT4 may be purified by immuno-
affinity purification from an OKT4 column and the
purified protein may then be used to perform quanti-
tative competition assays with rsT4.3. We believe
that the bivalent molecule would demonstrate equi-
valent competition against rsT4.3 for OKT4 binding,
but significantly greater competition against mono-
valent rsT4 for OKT4A binding. The ability of
bivalent recombinant soluble T4 to block syncytium
formation may also be demonstrated in the C8166
fusion assay. We also believe that bivalent
recombinant soluble T4 would block syncytium
formation at significantly lower concentrations
than monovalent rsT4, based upon the higher
functional affinity of bivalent recombinant
soluble T4 for gp120.

According to alternate embodiments of this
invention, other methods for producing polyvalent
rsT4 may be employed. For example, polyvalent rsT4
may be produced by chemically coupling rsT4 to any
clinically acceptable carrier molecule, a polymer
selected from the group consisting of Ficoll,
polyethylene glycol or dextran, using conventional
coupling techniques. Alternatively, rsT4 may be
chemically coupled to biotin, and the biotin-rsT4
conjugate then allowed to bind to avidin, resulting
in tetravalent avidin/biotin/rsT4 molecules. And
rsT4 may be covalently coupled to dinitrophenol
(DNP) or trinitrophenol (TNP) and the resulting
conjugate precipitated with anti-DNP or anti-TNP
IgM, to form decameric conjugates with a valency of
10 for rsT4 binding sites.

Alternatively, a recombinant chimeric
antibody molecule with rsT4 sequences substituted
for the variable domains of either or both of the
immunoglobulin molecule heavy and light chains may
be produced. Because recombinant soluble T4
possesses gp120 binding activity, a chimeric
antibody having two soluble T4 domains and having
unmodified constant region domains could
serve as a mediator of targeted killing of
HIV-infected cells that express gp120.

For example, chimeric rsT4/IgG1 may be
produced from two chimeric genes - an rsT4/human
kappa light chain chimera (rsT4/C kappa) and an
rsT4/human gamma 1 heavy chain chimera
(rsT4/C gamma-1) [illegible] bacterial neo
resistance or bacterial gpt
for selection in animal cell hosts against the
antibiotic G418 or mycophenolic acid, respectively.

To construct rsT4/C gamma-1 and rsT4/C kappa
chimeric genes, an rsT4 gene segment, including at
least the secretory signal sequence and the N-terminal
110 amino acid residues of the mature rsT4 coding
sequence and including a splice donor or portion
thereof, is placed upstream of the gamma-1 and kappa
constant domain exons. A suitable restriction
enzyme may be used to cut within the intron downstream
of the desired rsT4 coding sequence, thus
providing a donor splice site. Subsequently, a
suitable restriction enzyme is used to cut within
the introns upstream of the kappa and gamma-1
coding regions. The rsT4 sequence is then joined to
the kappa or gamma-1 constant region sequence, such
that the rsT4 intron sequence is contiguous with the
gamma-1 and kappa introns. In this way, an acceptor
splice site is provided by the kappa or gamma-1
constant region intron. Alternatively, rsT4 chimeric
genes may be constructed without the use of introns
by fusing a suitable rsT4 cDNA gene segment directly
to the gamma-1 or kappa coding regions.

The rsT4/C gamma-1 and rsT4/C kappa vectors
may then be cotransfected, for example, by
electroporation into lymphoid or non-lymphoid host cells.
Following transcription and translation of the two
chimeric genes, the gene products may assemble into
chimeric antibody molecules.

Expression of the chimeric gene products
may be measured by an enzyme-linked immunosorbent
assay (ELISA) that utilizes monoclonal anti-T4
antibody OKT4A, as described infra, or in gp120
competition assays and radioimmunoassays, as described infra.
Activity of the rsT4/IgG1 chimeras may be measured
by incubating them with HIV-infected cells in the
presence of human complement, followed by quantitating
subsequent complement-mediated lysis of these cells.
Alternatively, activity may be measured in HIV
replication and HIV syncytium assays as described infra.

In order to determine if bivalent rsT4 has
a greater potency than monovalent rsT4, we mixed
OKT4 at various concentrations, together with a
constant concentration of rsT4, so that the molar
ratio of OKT4:rsT4 varied between 0.2 and 4. After
preincubating the mixture overnight at 4°C, we added
aliquots to the HIV syncytium assay described infra.
OKT4 has no observable effect in this assay when used
alone. In addition, the concentration of recombinant
soluble T4 chosen did not cause inhibition in this
assay. Accordingly, we looked for indications that
the OKT4/rsT4 mixture was more potent than rsT4 alone.
We observed that at ratios of OKT4:rsT4 greater than
0.2, partial to complete inhibition of syncytium
formation occurred. We believe that under conditions
where two rsT4 molecules are bound to one OKT4 molecule,
the greatest inhibitory effect should be found.
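
The titration described above can be laid out numerically. The sketch
below is a minimal illustration, not the procedure actually used: the
rsT4 concentration of 10 nM and the intermediate ratios are hypothetical
placeholders, since the text gives only the range 0.2 to 4, and the
antibody is assumed to carry two binding sites.

```python
import math


def okt4_series(rst4_nM, ratios):
    """OKT4 concentration (nM) needed for each OKT4:rsT4 molar ratio."""
    return [rst4_nM * ratio for ratio in ratios]


def stoichiometric_ratio(sites_per_antibody=2):
    """OKT4:rsT4 molar ratio at which every antibody site holds one rsT4.

    Assumes a conventional antibody with two binding sites.
    """
    return 1.0 / sites_per_antibody


RST4_NM = 10.0                       # hypothetical constant rsT4 level
RATIOS = [0.2, 0.5, 1.0, 2.0, 4.0]   # spans the range quoted in the text

for ratio, okt4 in zip(RATIOS, okt4_series(RST4_NM, RATIOS)):
    marker = "  <- two rsT4 per OKT4" if math.isclose(
        ratio, stoichiometric_ratio()) else ""
    print(f"ratio {ratio:>4}: OKT4 = {okt4:5.1f} nM{marker}")
```

On these assumptions, the condition of two rsT4 molecules per OKT4
molecule corresponds to a molar ratio of 0.5, which lies inside the
range tested.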

Thus, polyvalent, as well as monovalent,
forms of recombinant soluble T4 are useful in the
compositions and methods of this invention.

Microorganisms and recombinant DNA molecules
prepared by the processes of this invention
are exemplified by cultures deposited in the In Vitro
International, Inc. culture collection, in Linthicum,
Maryland, and identified as:
BG378: E. coli MC1061/pBG378
199-7: E. coli MC1061/p199-7
170-2: E. coli JA221/p170-2
EC100: E. coli JM83/pEC100
BG377: E. coli MC1061/pBG377
BG380: E. coli MC1061/pBG380
BG381: E. coli MC1061/pBG381
These cultures were assigned accession
numbers IVI 10143-10149, respectively.

In addition, microorganisms and recombinant
DNA molecules according to this invention are exemplified
by cultures deposited in the In Vitro International,
Inc. culture collection, in Linthicum,
Maryland, on January 6, 1986, and identified as:
BG-391: E. coli MC1061/pBG391
BG-392: E. coli MC1061/pBG392
BG-393: E. coli MC1061/pBG393
BG-394: E. coli MC1061/pBG394
BG-396: E. coli MC1061/pBG396
203-5: E. coli SG936/p203-5.
These cultures were assigned accession
numbers IVI 10151-10156, respectively.

Microorganisms and recombinant DNA molecules
according to this invention are also exemplified
by cultures deposited in the In Vitro
International, Inc. culture collection, in Linthicum,
Maryland, on August 24, 1988 and identified as:
211-11: E. coli A89/pBG211-11
214-10: E. coli A89/pBG214-10
215-7: E. coli A89/pBG215-7
These cultures were assigned accession
numbers IVI 10183-10185, respectively.
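
Because each deposit list is tied to a run of accession numbers only by
the word "respectively", pairing them explicitly guards against
miscounting. The sketch below is illustrative: the helper name
`assign_accessions` is hypothetical, and the data are the six cultures
of the second deposit listed above.

```python
def assign_accessions(prefix, first, last, cultures):
    """Pair consecutive accession numbers with cultures, in listed order."""
    numbers = range(first, last + 1)
    if len(numbers) != len(cultures):
        raise ValueError(
            f"{len(numbers)} accession numbers for {len(cultures)} cultures")
    return {f"{prefix} {n}": culture for n, culture in zip(numbers, cultures)}


# Second deposit as listed in the text (IVI 10151-10156).
CULTURES = [
    "E. coli MC1061/pBG391",
    "E. coli MC1061/pBG392",
    "E. coli MC1061/pBG393",
    "E. coli MC1061/pBG394",
    "E. coli MC1061/pBG396",
    "E. coli SG936/p203-5",
]

accessions = assign_accessions("IVI", 10151, 10156, CULTURES)
for number, culture in accessions.items():
    print(number, culture)
```

The length check raises an error if a list and its accession range do
not match, which is the failure a garbled list would produce.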

While we have hereinbefore described a
number of embodiments of this invention, it is
apparent that our basic constructions can be altered
to provide other embodiments which utilize the
processes and compositions of this invention.
Therefore, it will be appreciated that the scope of this
invention is to be defined by the claims appended
hereto rather than by the specific embodiments which
have been presented hereinbefore by way of example.

CLAIMS

We claim: 

1. A DNA sequence selected from the group
consisting of:

(a) the DNA inserts of p199-7, pBG377,
pBG380, pBG381, p203-5, pBG391, pBG392, pBG393, pBG394,
pBG395, pBG396, pBG397, p211-11, p214-10 and p215-7;

(b) DNA sequences which hybridize
to one or more of the foregoing DNA inserts and which
code on expression for a soluble T4-like polypeptide;
and

(c) DNA sequences which code on
expression for a soluble T4-like polypeptide coded
for on expression by any of the foregoing DNA inserts
and sequences.

2. The DNA sequence according to claim 1,
wherein said DNA sequence (b) codes on expression
for a soluble T4-like polypeptide which inhibits
adhesion between T4+ lymphocytes and infective agents
which target T4+ lymphocytes and which inhibits
interaction between T4+ lymphocytes and antigen
presenting cells and targets of T4+ lymphocyte mediated
killing.

3. A recombinant DNA molecule comprising
a DNA sequence selected from the group consisting of
the DNA sequences of claim 1 or 2, said DNA sequence
being operatively linked to an expression control
sequence in said recombinant DNA molecule.

4. The recombinant DNA molecule according
to claim 3, wherein said expression control sequence
is selected from the group consisting of the early
or late promoters of SV40 or adenovirus, the lac
system, the trp system, the TAC system, the TRC
system, the major operator and promoter regions of
phage λ, the control regions of fd coat protein, the
promoter for 3-phosphoglycerate kinase or other
glycolytic enzymes, the promoters of acid phosphatase,
the polyhedron promoter of the baculovirus system
and the promoters of the yeast α-mating factors.

5. A unicellular host transformed with a
recombinant DNA molecule selected from the group
consisting of the recombinant DNA molecules of claim 3
or 4.

6. The host according to claim 5, wherein
said host is selected from the group consisting of
strains of E. coli, Pseudomonas, Bacillus,
Streptomyces, fungi, animal cells, plant cells,
insect cells and human cells in tissue culture.

7. A polypeptide coded for on expression
by a DNA sequence selected from the group consisting
of the DNA sequences of claim 1 or 2, said polypeptide
being essentially free of other proteins of
human origin.

8. The polypeptide according to claim 7,
wherein said polypeptide is selected from the group
consisting of a polypeptide of the formula AA-23-AA362
of Figure 3, a polypeptide of the formula AA1-362 of
Figure 3, a polypeptide of the formula Met-AA1-362
of Figure 3, a polypeptide of the formula AA1-374 of
Figure 3, a polypeptide of the formula Met-AA1-374
of Figure 3, a polypeptide of the formula AA1-377 of
Figure 3, a polypeptide of the formula Met-AA1-377
of Figure 3, a polypeptide of the formula AA-23-AA374
of Figure 3, a polypeptide of the formula AA-23-AA377
of Figure 3.

9. The polypeptide according to claim 7,
wherein said polypeptide is selected from the
group consisting of a polypeptide of the formula
AA-23-AA182 of Figure 16, a polypeptide of the
formula AA1-AA182 of Figure 16, a polypeptide of
the formula Met-AA1-182 of Figure 16, a polypeptide
of the formula AA-23-AA182 of Figure 16, followed
by the amino acids asparagine-leucine-glutamine-
histidine-serine-leucine, a polypeptide of the formula
AA1-AA182 of Figure 16, followed by the amino acids
asparagine-leucine-glutamine-histidine-serine-leucine,
a polypeptide of the formula Met-AA1-182 of Figure 16,
followed by the amino acids asparagine-leucine-
glutamine-histidine-serine-leucine, a polypeptide of
the formula AA-23-AA113 of Figure 16, a polypeptide
of the formula AA1-AA113 of Figure 16, a polypeptide
of the formula Met-AA1-113 of Figure 16, a polypeptide
of the formula AA-23-AA111 of Figure 16, a polypeptide
of the formula AA1-AA111 of Figure 16, a polypeptide
of the formula Met-AA1-111 of Figure 16, a polypeptide
of the formula AA-23-AA131 of Figure 16, a polypeptide
of the formula AA1-AA131 of Figure 16, a
polypeptide of the formula Met-AA1-131 of Figure 16,
a polypeptide of the formula AA-23-AA145 of Figure 16,
a polypeptide of the formula AA1-AA145 of Figure 16,
a polypeptide of the formula Met-AA1-145 of Figure 16,
a polypeptide of the formula AA-23-AA166 of Figure 16,
a polypeptide of the formula AA1-AA166 of Figure 16,
a polypeptide of the formula Met-AA1-166 of Figure 16,
or portions thereof.

10. The polypeptide according to claim 7,
wherein said polypeptide is selected from the group
consisting of a polypeptide of the formula AA-23-AA362
of mature T4 protein, a polypeptide of the formula
AA1-362 of mature T4 protein, a polypeptide of the
formula Met-AA1-362 of mature T4 protein, a polypeptide
of the formula AA1-374 of mature T4 protein, a
polypeptide of the formula Met-AA1-374 of mature T4
protein, a polypeptide of the formula AA1-377 of
mature T4 protein, a polypeptide of the formula
Met-AA1-377 of mature T4 protein, a polypeptide of
the formula AA-23-AA374 of mature T4 protein, a
polypeptide of the formula AA-23-AA377 of mature T4
protein, or portions thereof.

11. The polypeptide according to claim 7,
wherein said polypeptide is selected from the group
consisting of a polypeptide of the formula AA-23-AA182
of mature T4 protein, a polypeptide of the formula
AA1-AA182 of mature T4 protein, a polypeptide of the
formula Met-AA1-182 of mature T4 protein, a polypeptide
of the formula AA-23-AA182 of mature T4 protein,
followed by the amino acids asparagine-leucine-
glutamine-histidine-serine-leucine, a polypeptide of
the formula AA1-AA182 of mature T4 protein, followed
by the amino acids asparagine-leucine-glutamine-
histidine-serine-leucine, a polypeptide of the formula
Met-AA1-182 of mature T4 protein, followed by the
amino acids asparagine-leucine-glutamine-histidine-
serine-leucine, a polypeptide of the formula
AA-23-AA113 of mature T4 protein, a polypeptide of
the formula AA1-AA113 of mature T4 protein, a polypeptide
of the formula Met-AA1-113 of mature T4 protein,
a polypeptide of the formula AA-23-AA111 of mature
T4 protein, a polypeptide of the formula AA1-AA111
of mature T4 protein, a polypeptide of the formula
Met-AA1-111 of mature T4 protein, a polypeptide of
the formula AA-23-AA131 of mature T4 protein, a
polypeptide of the formula AA1-AA131 of mature T4 protein,
a polypeptide of the formula Met-AA1-131 of mature
T4 protein, a polypeptide of the formula AA-23-AA145
of mature T4 protein, a polypeptide of the formula
AA1-AA145 of mature T4 protein, a polypeptide of the
formula Met-AA1-145 of mature T4 protein, a polypeptide
of the formula AA-23-AA166 of mature T4 protein,
a polypeptide of the formula AA1-AA166 of mature T4
protein, a polypeptide of the formula Met-AA1-166 of
mature T4 protein, or portions thereof.

12. A method for producing a polypeptide 
selected from the group consisting of the polypeptides 
of any one of claims 7 to 11 comprising the step of 
culturing a unicellular host transformed with a recom- 
binant DNA molecule selected from the group consisting 
of the recombinant DNA molecules of claim 3 or 4. 



13. A pharmaceutical composition comprising
an immunotherapeutic or immunosuppressive effective
amount of a polypeptide selected from the group
consisting of the polypeptides of any one of claims 7 to
11 and a pharmaceutically acceptable carrier.

14. A method for treating patients comprising
the step of treating them in a pharmaceutically
acceptable manner with a composition selected
from the group consisting of the composition of
claim 13.

15. The method according to claim 14,
wherein the patient is treated by intramuscular
injection of the composition.

16. A diagnostic composition for detecting
or for monitoring the course of HIV infection comprising
a diagnostic effective amount of a polypeptide
selected from the group consisting of the polypeptides
of any one of claims 7 to 11.

17. A method for detecting or for monitoring
the course of HIV infection comprising the
step of employing as a diagnostic a composition
selected from the group consisting of the compositions
of claim 16.

18. A means for detecting or for monitoring
the course of HIV infection comprising a composition
selected from the group consisting of the compositions
of claim 16.

19. A pharmaceutical composition comprising
an immunotherapeutic or immunosuppressive effective
amount of antibody to a polypeptide selected
from the group consisting of the polypeptides of any
one of claims 7 to 11 and a pharmaceutically acceptable
carrier.

20. A method for treating patients comprising
the step of treating them in a pharmaceutically
acceptable manner with a composition according
to claim 19.

21. The use of a polypeptide selected
from the group consisting of the polypeptides of any
one of claims 7 to 11 to purify HIV virus.

22. The use according to claim 20, wherein 
the HIV virus is purified from a biological sample. 

23. A method for purifying HIV virus from
a sample comprising the step of exposing the sample
to a polypeptide selected from the group consisting
of the polypeptides of any one of claims 7 to 11.

24. The method according to claim 22, 
wherein the sample is a biological sample. 




25. A DNA sequence comprising the DNA
insert of p170-2, said sequence coding on expression
for a T4-like polypeptide.

26. A recombinant DNA molecule comprising 
a DNA sequence selected from the group consisting of 
the DNA sequence of claim 25, said DNA sequence 
being operatively linked to an expression control 
sequence in said recombinant DNA molecule. 

27. A unicellular host transformed with a 
recombinant DNA molecule according to claim 26. 

28. A polypeptide coded for on expression
by a DNA sequence of claim 25, said polypeptide being
essentially free of other proteins of human origin.

29. A pharmaceutical composition comprising
an immunotherapeutic or immunosuppressive amount of a
soluble protein receptor and a pharmaceutically
acceptable carrier.

30. A method for treating patients
comprising the step of treating them in a pharmaceutically
acceptable manner with a pharmaceutical
composition of claim 29.

31. A diagnostic composition for detecting 
or for monitoring the course of viral infection com- 
prising a diagnostic effective amount of a soluble 
protein receptor. 

32. A method for detecting or for monitoring
the course of a viral infection comprising
the step of employing as a diagnostic a diagnostic
effective amount of a soluble protein receptor.

33. A means for detecting or for monitoring
the course of a viral infection comprising
a soluble protein receptor.

34. A DNA sequence selected from the group 
consisting of: 

(a) the DNA insert of pBiv.1;

(b) DNA sequences which hybridize to
the DNA insert of pBiv.1 and which code on expression
for a polyvalent soluble T4-like polypeptide; and

(c) DNA sequences which code on
expression for a polyvalent soluble T4-like polypeptide
coded for by the DNA insert of pBiv.1.

35. A recombinant DNA molecule comprising 
a DNA sequence selected from the group consisting of 
the DNA sequences of claim 34, said DNA sequence 
being operatively linked to an expression control 
sequence in said recombinant DNA molecule. 

36. A unicellular host transformed with a 
recombinant DNA molecule according to claim 35. 

37. A polypeptide coded for on expression 
by a DNA sequence selected from the group consisting 
of the DNA sequences according to claim 34, said 
polypeptide being essentially free of other proteins 
of human origin. 

38. The polypeptide according to claim 7, 
wherein said polypeptide is polyvalent. 

39. A method for producing a polyvalent 
polypeptide comprising the steps of: 

(a) culturing a unicellular host
transformed with a recombinant DNA molecule according
to claim 3 or 4 to produce a polypeptide; and

(b) coupling said polypeptide to a
carrier to form a polyvalent polypeptide.

40. A DNA sequence comprising: 

(a) a first portion comprising a DNA 
sequence coding for the constant region of an immuno- 
globulin light chain; and 

(b) a second portion comprising a
DNA sequence according to claim 1 or 2, or portions
thereof, said second portion being joined upstream
of said first portion.

41. A DNA sequence comprising: 

(a) a first portion comprising a DNA 
sequence coding for the constant region of an immuno- 
globulin heavy chain; and 

(b) a second portion comprising a 
DNA sequence according to claim 1 or 2, or portions 
thereof, said second portion being joined upstream 
of said first portion. 

42. An expression vector comprising the 
DNA sequence according to claim 40. 

43. An expression vector comprising the 
DNA sequence according to claim 41. 

44. An expression vector comprising the 
DNA sequence according to claim 40 and the DNA 
sequence according to claim 41. 

45. A method for producing a chimeric
rsT4/IgG1 comprising the step of co-transfecting a
host cell with the expression vector according to
claim 42 and the expression vector according to
claim 43.

46. A method for producing a chimeric
rsT4/IgG1 comprising the step of transfecting a host
cell with the expression vector according to claim 44.

47. A chimeric rsT4/IgG1 produced by the
method according to claim 45 or 46.

48. A pharmaceutical composition comprising 
an immunotherapeutic or immunosuppressive effective 
amount of a polypeptide according to claim 37 or 38. 

49. A method for treating patients com- 
prising the step of treating them in a pharmaceuti- 
cally acceptable manner with a composition according 
to claim 48. 

50. A diagnostic composition for detecting
or for monitoring the course of HIV infection comprising
a diagnostic effective amount of a polypeptide
according to claim 37 or 38.

51. A pharmaceutical composition comprising
an immunotherapeutic or immunosuppressive effective
amount of a chimeric rsT4/IgG1 according to claim 47.

52. A method for treating patients com-
prising the step of treating them in a pharmaceuti-
cally acceptable manner with a composition according
to claim 51.

FIG. 15

pBG391
pBG368 backbone
: soluble T4 #3
: AA #3 = LYS

bg391.seq Length: 6151

1 GAATTAATTC CAGCTTGCTG TGGAATGTGT GTCAGTTAGG GTGTGGAAAG
51 TCCCCAGGCT CCCCAGCAGG CAGAAGTATG CAAAGCATGC ATCTCAATTA
101 GTCAGCAACC AGGTGTGGAA AGTCCCCAGG CTCCCCAGCA GGCAGAAGTA 
151 TGCAAAGCAT GCATCTCAAT TAGTCAGCAA CCATAGTCCC GCCCCTAACT 
201 CCGCCCATCC CGCCCCTAAC TCCGCCCAGT TCCGCCCATT CTCCGCCCCA 
251 TGGCTGACTA ATTTTTTTTA TTTATGCAGA GGCCGAGGCC GCCTCGGCCT 
301 CTGAGCTATT CCAGAAGTAG TGAGGAGGCT TTTTTGGAGG GGTCCTCCTC 
351 GTATAGAAAC TCGGACCACT CTGAGACGAA GGCTCGCGTC CAGGCCAGCA 
401 CGAAGGAGGC TAAGTGGGAG GGGTAGCGGT CGTTGTCCAC TAGGGGGTCC 
451 ACTCGCTCCA GGGTGTGAAG ACACATGTCG CCCTCTTCGG CATCAAGGAA 
501 GGTGATTGGT TTATAGGTGT AGGCCACGTG ACCGGGTGTT CCTGAAGGGG 
551 GGCTATAAAA GGGGGTGGGG GCGCGTTCGT CCTCACTCTC TTCCGCATCG 
601 CTGTCTGCGA GGGCCAGCTG TTGGGCTCGC GGTTGAGGAC AAACTCTTCG 
651 CGGTCTTTCC AGTACTCTTG GATCGGAAAC CCGTCGGCCT CCGAACGGTA 
701 CTCCGCCACC GAGGGACCTG AGCGAGTCCG CATCGACCGG ATCGGAAAAC 
751 CTCTCGAGAA AGGCGTCTAA CCAGTCACAG TCGCAAGGTA GGCTGAGCAC 
801 CGTGGCGGGC GGCAGCGGGT GGCGGTCGGG GTTGTTTCTG GCGGAGGTGC 
851 TGCTGATGAT GTAATTAAAG TAGGCGGTCT TGAGACGGCG GATGGTCGAG
901 GTGAGGTGTG GCAGGCTTGA GATCGATCTG GCCATACACT TGAGTGACAA 
951 TGACATCCAC TTTGCCTTTC TCTCCACAGG TGTCCACTCC CAGGTCCAAC 
1001 TGGATCCAAG CTTCGACTCG AGGAATTCCC CGAAGGAACA AAGCACCCTC 
1051 CCCACTGGGC TCCTGGTTGC AGAGCTCCAA GTCCTCACAC AGATACGCCT 
1101 GTTTGAGAAG CAGCGGGCAA GAAAGACGCA AGCCCAGAGG CCCTGCCATT 
1151 TCTGTGGGCT CAGGTCCCTA CTGGCTCAGG CCCCTGCCTC CCTCGGCAAG 
1201 GCCACAATGA ACCGGGGAGT CCCTTTTAGG CACTTGCTTC TGGTGCTGCA 
1251 ACTGGCGCTC CTCCCAGCAG CCACTCAGGG AAAGAAAGTG GTGCTGGGCA 

1301 AAAAAGGGGA TACAGTGGAA CTGACCTGTA CAGCTTCCCA GAAGAAGAGC
1351 ATACAATTCC ACTGGAAAAA CTCCAACCAG ATAAAGATTC TGGGAAATCA
1401 GGGCTCCTTC TTAACTAAAG GTCCATCCAA GCTGAATGAT CGCGCTGACT

FIG. 15 (cont'd)

1451 CAAGAAGAAG CTTGTGGGAC CAAGGAAACT TTCCCCTGAT CATCAAGAAT
1501 CTTAAGATAG AAGACTCAGA TACTTACATC TGTGAAGTGG AGGACCAGAA
1551 GGAGGAGGTG CAATTGCTAG TGTTCGGATT GACTGCCAAC TCTGACACCC
1601 ACCTGCTTCA GGGGCAGAGC CTGACCCTGA CCTTGGAGAG CCCCCCTGGT
1651 AGTAGCCCCT CAGTGCAATG TAGGAGTCCA AGGGGTAAAA ACATACAGGG
1701 GGGGAAGACC CTCTCCGTGT CTCAGCTGGA GCTCCAGGAT AGTGGCACCT

1751 GGACATGCAC TGTCTTGCAG AACCAGAAGA AGGTGGAGTT CAAAATAGAC
1801 ATCGTGGTGC TAGCTTTCCA GAAGGCCTCC AGCATAGTCT ATAAGAAAGA
1851 GGGGGAACAG GTGGAGTTCT CCTTCCCACT CGCCTTTACA GTTGAAAAGC
1901 TGACGGGCAG TGGCGAGCTG TGGTGGCAGG CGGAGAGGGC TTCCTCCTCC 
1951 AAGTCTTGGA TCACCTTTGA CCTGAAGAAC AAGGAAGTGT CTGTAAAACG
2001 GGTTACCCAG GACCCTAAGC TCCAGATGGG CAAGAAGCTC CCGCTCCACC 
2051 TCACCCTGCC CCAGGCCTTG CCTCAGTATG CTGGCTCTGG AAACCTCACC 
2101 CTGGCCCTTG AAGCGAAAAC AGGAAAGTTG CATCAGGAAG TGAACCTGGT
2151 GGTGATGAGA GCCACTCAGC TCCAGAAAAA TTTGACCTGT GAGGTGTGGG
2201 GACCCACCTC CCCTAAGCTG ATGCTGAGTT TGAAACTGGA GAACAAGGAG 
2251 GCAAAGGTCT CGAAGCGGGA GAAGGCGGTG TGGGTGCTGA ACCCTGAGGC
2301 GGGGATGTGG CAGTGTCTGC TGAGTGACTC GGGACAGGTC CTGCTGGAAT 
2351 CCAACATCAA GGTTCTGCCC ACATGGTCGA CCCCGGTGCA GCCAATGGCC 
2401 CTGATTTGAG ATCTTTGTGA AGGAACCTTA CTTCTGTGGT GTGACATAAT 
2451 TGGACAAACT ACCTACAGAG ATTTAAAGCT CTAAGGTAAA TATAAAATTT
2501 TTAAGTGTAT AATGTGTTAA ACTACTGATT CTAATTGTTT GTGTATTTTA 
2551 GATTCCAACC TATGGAACTG ATGAATGGGA GCAGTGGTGG AATGCCTTTA 
2601 ATGAGGAAAA CCTGTTTTGC TCAGAAGAAA TGCCATCTAG TGATGATGAG 
2651 GCTACTGCTG ACTCTCAACA TTCTACTCCT CCAAAAAAGA AGAGAAAGGT
2701 AGAAGACCCC AAGGACTTTC CTTCAGAATT GCTAAGTTTT TTGAGTCATG 
2751 CTGTGTTTAG TAATAGAACT CTTGCTTGCT TTGCTATTTA CACCACAAAG 
2801 GAAAAAGCTG CACTGCTATA CAAGAAAATT ATGGAAAAAT ATTCTGTAAC 
2851 CTTTATAAGT AGGCATAACA GTTATAATCA TAACATACTG TTTTTTCTTA 
2901 CTCCACACAG GCATAGAGTG TCTGCTATTA ATAACTATGC TCAAAAATTG 
2951 TGTACCTTTA GCTTTTTAAT TTGTAAAGGG GTTAATAAGG AATATTTGAT 
3001 GTATAGTGCC TTGACTAGAG ATCATAATCA GCCATACCAC ATTTGTAGAG
3051 GTTTTACTTG CTTTAAAAAA CCTCCCACAC CTCCCCCTGA ACCTGAAACA
3101 TAAAATGAAT GCAATTGTTG TTGTTAACTT GTTTATTGCA GCTTATAATG
3151 GTTACAAATA AAGCAATAGC ATCACAAATT TCACAAATAA AGCATTTTTT
3201 TCACTGCATT CTAGTTGTGG TTTGTCCAAA CTCATCAATG TATCTTATCA
3251 TGTCTGGATC CTCTACGCCG GACGCATCGT GGCCGGCATC ACCGGCGCCA
3301 CAGGTGCGGT TGCTGGCGCC TATATCGCCG ACATCACCGA TGGGGAAGAT
3351 CGGGCTCGCC ACTTCGGGCT CATGAGCGCT TGTTTCGGCG TGGGTATGGT
3401 GGCAGGCCCG TGGCCGGGGG ACTGTTGGGC GCCATCTCCT TGCATGCACC
3451 ATTCCTTGCG GCGGCGGTGC TCAACGGCCT CAACCTACTA CTGGGCTGCT
3501 TCCTAATGCA GGAGTCGCAT AAGGGAGAGC GTCGACCGAT GCCCTTGAGA
3551 GCCTTCAACC CAGTCAGCTC CTTCCGGTGG GCGCGGGGCA TGACTATCGT
3601 CGCCGCACTT ATGACTGTCT TCTTTATCAT GCAACTCGTA GGACAGGTGC
3651 CGGCAGCGCT CTGGGTCATT TTCGGCGAGG ACCGCTTTCG CTGGAGCGCG
3701 ACGATGATCG GCCTGTCGCT TGCGGTATTC GGAATCTTGC ACGCCCTCGC
3751 TCAAGCCTTC GTCACTGGTC CCGCCACCAA ACGTTTCGGC GAGAAGCAGG
3801 CCATTATCGC CGGCATGGCG GCCGACGCGC TGGGCTACGT CTTGCTGGCG
3851 TTCGCGACGC GAGGCTGGAT GGCCTTCCCC ATTATGATTC TTCTCGCTTC
3901 CGGCGGCATC GGGATGCCCG CGTTGCAGGC CATGCTGTCC AGGCAGGTAG
3951 ATGACGACCA TCAGGGACAG CTTCAAGGAT CGCTCGCGGC TCTTACCAGC
4001 CTAACTTCGA TCACTGGACC GCTGATCGTC ACGGCGATTT ATGCCGCCTC
4051 GGCGAGCACA TGGAACGGGT TGGCATGGAT TGTAGGCGCC GCCCTATACC
4101 TTGTCTGCCT CCCCGCGTTG CGTCGCGGTG CATGGAGCCG GGCCACCTCG
4151 ACCTGAATGG AAGCCGGCGG CACCTCGCTA ACGGATTCAC CACTCCAAGA
4201 ATTGGAGCCA ATCAATTCTT GCGGAGAACT GTGAATGCGC AAACCAACCC
4251 TTGGCAGAAC ATATCCATCG CGTCCGCCAT CTCCAGCAGC CGCACGCGGC
4301 GCATCTCGGG CCGCGTTGCT GGCGTTTTTC CATAGGCTCC GCCCCCCTGA
4351 CGAGCATCAC AAAAATCGAC GCTCAAGTCA GAGGTGGCGA AACCCGACAG
4401 GACTATAAAG ATACCAGGCG TTTCCCCCTG GAAGCTCCCT CGTGCGCTCT
4451 CCTGTTCCGA CCCTGCCGCT TACCGGATAC CTGTCCGCCT TTCTCCCTTC
4501 GGGAAGCGTG GCGCTTTCTC AATGCTCACG CTGTAGGTAT CTCAGTTCGG
4551 TGTAGGTCGT TCGCTCCAAG CTGGGCTGTG TGCACGAACC CCCCGTTCAG
4601 CCCGACCGCT GCGCCTTATC CGGTAACTAT CGTCTTGAGT CCAACCCGGT
4651 AAGACACGAC TTATCGCCAC TGGCAGCAGC CACTGGTAAC AGGATTAGCA
4701 GAGCGAGGTA TGTAGGCGGT GCTACAGAGT TCTTGAAGTG GTGGCCTAAC
4751 TACGGCTACA CTAGAAGGAC AGTATTTGGT ATCTGCGCTC TGCTGAAGCC
4801 AGTTACCTTC GGAAAAAGAG TTGGTAGCTC TTGATCCGGC AAACAAACCA
4851 CCGCTGGTAG CGGTGGTTTT TTTGTTTGCA AGCAGCAGAT TACGCGCAGA
4901 AAAAAAGGAT CTCAAGAAGA TCCTTTGATC TTTTCTACGG GGTCTGACGC
4951 TCAGTGGAAC GAAAACTCAC GTTAAGGGAT TTTGGTCATG AGATTATCAA
5001 AAAGGATCTT CACCTAGATC CTTTTAAATT AAAAATGAAG TTTTAAATCA
5051 ATCTAAAGTA TATATGAGTA AACTTGGTCT GACAGTTACC AATGCTTAAT
5101 CAGTGAGGCA CCTATCTCAG CGATCTGTCT ATTTCGTTCA TCCATAGTTG
5151 CCTGACTCCC CGTCGTGTAG ATAACTACGA TACGGGAGGG CTTACCATCT
5201 GGCCCCAGTG CTGCAATGAT ACCGCGAGAC CCACGCTCAC CGGCTCCAGA
5251 TTTATCAGCA ATAAACCAGC CAGCCGGAAG GGCCGAGCGC AGAAGTGGTC
5301 CTGCAACTTT ATCCGCCTCC ATCCAGTCTA TTAATTGTTG CCGGGAAGCT
5351 AGAGTAAGTA GTTCGCCAGT TAATAGTTTG CGCAACGTTG TTGCCATTGC
5401 TGCAGGCATC GTGGTGTCAC GCTCGTCGTT TGGTATGGCT TCATTCAGCT
5451 CCGGTTCCCA ACGATCAAGG CGAGTTACAT GATCCCCCAT GTTGTGCAAA
5501 AAAGCGGTTA GCTCCTTCGG TCCTCCGATC GTTGTCAGAA GTAAGTTGGC
5551 CGCAGTGTTA TCACTCATGG TTATGGCAGC ACTGCATAAT TCTCTTACTG
5601 TCATGCCATC CGTAAGATGC TTTTCTGTGA CTGGTGAGTA CTCAACCAAG
5651 TCATTCTGAG AATAGTGTAT GCGGCGACCG AGTTGCTCTT GCCCGGCGTC
5701 AACACGGGAT AATACCGCGC CACATAGCAG AACTTTAAAA GTGCTCATCA
5751 TTGGAAAACG TTCTTCGGGG CGAAAACTCT CAAGGATCTT ACCGCTGTTG
5801 AGATCCAGTT CGATGTAACC CACTCGTGCA CCCAACTGAT CTTCAGCATC
5851 TTTTACTTTC ACCAGCGTTT CTGGGTGAGC AAAAACAGGA AGGCAAAATG
5901 CCGCAAAAAA GGGAATAAGG GCGACACGGA AATGTTGAAT ACTCATACTC
5951 TTCCTTTTTC AATATTATTG AAGCATTTAT CAGGGTTATT GTCTCATGAG
6001 CGGATACATA TTTGAATGTA TTTAGAAAAA TAAACAAATA GGGGTTCCGC
6051 GCACATTTCC CCGAAAAGTG CCACCTGACG TCTAAGAAAC CATTATTATC
6101 ATGACATTAA CCTATAAAAA TAGGCGTATC ACGAGGCCCT TTCGTCTTCA
6151 A
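
A listing in this layout (a start position followed by five blocks of
ten bases) lends itself to a mechanical consistency check. The sketch
below is an illustration under stated assumptions, not part of the
disclosure: the helper names are hypothetical, and the sample rows are
rows 1801 and 1851 of the listing above. The only outside fact used is
the NheI recognition sequence GCTAGC.

```python
import re

ROW = re.compile(r"^\s*(\d+)\s+(.*)$")


def parse_row(line):
    """Split '1801 ATCG... ...' into (start_position, sequence)."""
    match = ROW.match(line)
    if match is None:
        raise ValueError(f"not a sequence row: {line!r}")
    position = int(match.group(1))
    # Internal spacing carries no information, so drop all whitespace.
    sequence = re.sub(r"\s+", "", match.group(2)).upper()
    return position, sequence


def check_rows(lines, row_length=50):
    """Return a list of problems: bad starts, odd lengths, stray symbols."""
    problems = []
    expected = None
    for line in lines:
        position, sequence = parse_row(line)
        if expected is not None and position != expected:
            problems.append(f"row {position}: expected start {expected}")
        if re.fullmatch(r"[ACGT]+", sequence) is None:
            problems.append(f"row {position}: non-ACGT characters")
        if len(sequence) != row_length:
            problems.append(f"row {position}: {len(sequence)} bases")
        expected = position + row_length
    return problems


def find_sites(lines, site):
    """1-based start positions of a recognition site in contiguous rows."""
    first_position, _ = parse_row(lines[0])
    joined = "".join(parse_row(line)[1] for line in lines)
    pattern = f"(?={re.escape(site)})"   # lookahead also finds overlaps
    return [first_position + m.start() for m in re.finditer(pattern, joined)]


rows = [
    "1801 ATCGTGGTGC TAGCTTTCCA GAAGGCCTCC AGCATAGTCT ATAAGAA AGA",
    "1851 GGGGGAACAG GTGGAGTTCT CCTTCCCACT CGCCTTTACA GTTGAAAAGC",
]
print(check_rows(rows))            # []
print(find_sites(rows, "GCTAGC"))  # [1809]
```

The final row of a listing is normally shorter than 50 bases and would
be reported by `check_rows`; that report is expected rather than an
error in the sequence.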



WO 89/01940    PCT/US88/02940



pBG368 backbone
: soluble T4 #7
: AA #3 = LYS
: 182AA-6AA
: from 203-5

bg392.seq Length: 6149



FIG. 16

1 GAATTAATTC CAGCTTGCTG TGGAATGTGT GTCAGTTAGG GTGTGGAAAG
51 TCCCCAGGCT CCCCAGCAGG CAGAAGTATG CAAAGCATGC ATCTCAATTA
101 GTCAGCAACC AGGTGTGGAA AGTCCCCAGG CTCCCCAGCA GGCAGAAGTA
151 TGCAAAGCAT GCATCTCAAT TAGTCAGCAA CCATAGTCCC GCCCCTAACT
201 CCGCCCATCC CGCCCCTAAC TCCGCCCAGT TCCGCCCATT CTCCGCCCCA
251 TGGCTGACTA ATTTTTTTTA TTTATGCAGA GGCCGAGGCC GCCTCGGCCT
301 CTGAGCTATT CCAGAAGTAG TGAGGAGGCT TTTTTGGAGG GGTCCTCCTC
351 GTATAGAAAC TCGGACCACT CTGAGACGAA GGCTCGCGTC CAGGCCAGCA
401 CGAAGGAGGC TAAGTGGGAG GGGTAGCGGT CGTTGTCCAC TAGGGGGTCC
451 ACTCGCTCCA GGGTGTGAAG ACACATGTCG CCCTCTTCGG CATCAAGGAA
501 GGTGATTGGT TTATAGGTGT AGGCCACGTG ACCGGGTGTT CCTGAAGGGG
551 GGCTATAAAA GGGGGTGGGG GCGCGTTCGT CCTCACTCTC TTCCGCATCG
601 CTGTCTGCGA GGGCCAGCTG TTGGGCTCGC GGTTGAGGAC AAACTCTTCG
651 CGGTCTTTCC AGTACTCTTG GATCGGAAAC CCGTCGGCCT CCGAACGGTA
701 CTCCGCCACC GAGGGACCTG AGCGAGTCCG CATCGACCGG ATCGGAAAAC
751 CTCTCGAGAA AGGCGTCTAA CCAGTCACAG TCGCAAGGTA GGCTGAGCAC
801 CGTGGCGGGC GGCAGCGGGT GGCGGTCGGG GTTGTTTCTG GCGGAGGTGC
851 TGCTGATGAT GTAATTAAAG TAGGCGGTCT TGAGACGGCG GATGGTCGAG
901 GTGAGGTGTG GCAGGCTTGA GATCGATCTG GCCATACACT TGAGTGACAA
951 TGACATCCAC TTTGCCTTTC TCTCCACAGG TGTCCACTCC CAGGTCCAAC
1001 TGGATCCAAG CTTCGACTCG AGGAATTCCC CGAAGGAACA AAGCACCCTC
1051 CCCACTGGGC TCCTGGTTGC AGAGCTCCAA GTCCTCACAC AGATACGCCT
1101 GTTTGAGAAG CAGCGGGCAA GAAAGACGCA AGCCCAGAGG CCCTGCCATT
1151 TCTGTGGGCT CAGGTCCCTA CTGGCTCAGG CCCCTGCCTC CCTCGGCAAG
           MET
1201 GCCACAATGA ACCGGGGAGT CCCTTTTAGG CACTTGCTTC TGGTGCTGCA
1251 ACTGGCGCTC CTCCCAGCAG CCACTCAGGG AAAGAAAGTG GTGCTGGGCA
1301 AAAAAGGGGA TACAGTGGAA CTGACCTGTA CAGCTTCCCA GAAGAAGAGC
1351 ATACAATTCC ACTGGAAAAA CTCCAACCAG ATAAAGATTC TGGGAAATCA


FIG. 16 (cont'd)

1401 GGGCTCCTTC TTAACTAAAG GTCCATCCAA GCTGAATGAT CGCGCTGACT
1451 CAAGAAGAAG CTTGTGGGAC CAAGGAAACT TTCCCCTGAT CATCAAGAAT
1501 CTTAAGATAG AAGACTCAGA TACTTACATC TGTGAAGTGG AGGACCAGAA
1551 GGAGGAGGTG CAATTGCTAG TGTTCGGATT GACTGCCAAC TCTGACACCC
1601 ACCTGCTTCA GGGGCAGAGC CTGACCCTGA CCTTGGAGAG CCCCCCTGGT
1651 AGTAGCCCCT CAGTGCAATG TAGGAGTCCA AGGGGTAAAA ACATACAGGG
1701 GGGGAAGACC CTCTCCGTGT CTCAGCTGGA GCTCCAGGAT AGTGGCACCT
1751 GGACATGCAC TGTCTTGCAG AACCAGAAGA AGGTGGAGTT CAAAATAGAC
                                               STOP
1801 ATCGTGGTGC TAGCTTTCCA GAACCTCCAG CATAGTCTAT AAGAAAGAGG
1851 GGGAACAGGT GGAGTTCTCC TTCCCACTCG CCTTTACAGT TGAAAAGCTG
1901 ACGGGCAGTG GCGAGCTGTG GTGGCAGGCG GAGAGGGCTT CCTCCTCCAA
1951 GTCTTGGATC ACCTTTGACC TGAAGAACAA GGAAGTGTCT GTAAAACGGG
2001 TTACCCAGGA CCCTAAGCTC CAGATGGGCA AGAAGCTCCC GCTCCACCTC
2051 ACCCTGCCCC AGGCCTTGCC TCAGTATGCT GGCTCTGGAA ACCTCACCCT
2101 GGCCCTTGAA GCGAAAACAG GAAAGTTGCA TCAGGAAGTG AACCTGGTGG
2151 TGATGAGAGC CACTCAGCTC CAGAAAAATT TGACCTGTGA GGTGTGGGGA
2201 CCCACCTCCC CTAAGCTGAT GCTGAGTTTG AAACTGGAGA ACAAGGAGGC
2251 AAAGGTCTCG AAGCGGGAGA AGGCGGTGTG GGTGCTGAAC CCTGAGGCGG
2301 GGATGTGCCA GTGTCTGCTG AGTGACTCGG GACAGGTCCT GCTGGAATCC
2351 AACATCAAGG TTCTGCCCAC ATGGTCGACC CCGGTGCAGC CAATGGCCCT
2401 GATTTGAGAT CTTTGTGAAG GAACCTTACT TCTGTGGTGT GACATAATTG
2451 GACAAACTAC CTACAGAGAT TTAAAGCTCT AAGGTAAATA TAAAATTTTT
2501 AAGTGTATAA TGTGTTAAAC TACTGATTCT AATTGTTTGT GTATTTTAGA
2551 TTCCAACCTA TGGAACTGAT GAATGGGAGC AGTGGTGGAA TGCCTTTAAT
2601 GAGGAAAACC TGTTTTGCTC AGAAGAAATG CCATCTAGTG ATGATGAGGC
2651 TACTGCTGAC TCTCAACATT CTACTCCTCC AAAAAAGAAG AGAAAGGTAG
2701 AAGACCCCAA GGACTTTCCT TCAGAATTGC TAAGTTTTTT GAGTCATGCT
2751 GTGTTTAGTA ATAGAACTCT TGCTTGCTTT GCTATTTACA CCACAAAGGA
2801 AAAAGCTGCA CTGCTATACA AGAAAATTAT GGAAAAATAT TCTGTAACCT
2851 TTATAAGTAG GCATAACAGT TATAATCATA ACATACTGTT TTTTCTTACT
2901 CCACACAGGC ATAGAGTGTC TGCTATTAAT AACTATGCTC AAAAATTGTG
2951 TACCTTTAGC TTTTTAATTT GTAAAGGGGT TAATAAGGAA TATTTGATGT
3001 ATAGTGCCTT GACTAGAGAT CATAATCAGC CATACCACAT TTGTAGAGGT

FIG. 16 (cont'd)

3051 TTTACTTGCT TTAAAAAACC TCCCACACCT CCCCCTGAAC CTGAAACATA
3101 AAATGAATGC AATTGTTGTT GTTAACTTGT TTATTGCAGC TTATAATGGT
3151 TACAAATAAA GCAATAGCAT CACAAATTTC ACAAATAAAG CATTTTTTTC
3201 ACTGCATTCT AGTTGTGGTT TGTCCAAACT CATCAATGTA TCTTATCATG
3251 TCTGGATCCT CTACGCCGGA CGCATCGTGG CCGGCATCAC CGGCGCCACA
3301 GGTGCGGTTG CTGGCGCCTA TATCGCCGAC ATCACCGATG GGGAAGATCG
3351 GGCTCGCCAC TTCGGGCTCA TGAGCGCTTG TTTCGGCGTG GGTATGGTGG
3401 CAGGCCCGTG GCCGGGGGAC TGTTGGGCGC CATCTCCTTG CATGCACCAT
3451 TCCTTGCGGC GGCGGTGCTC AACGGCCTCA ACCTACTACT GGGCTGCTTC
3501 CTAATGCAGG AGTCGCATAA GGGAGAGCGT CGACCGATGC CCTTGAGAGC
3551 CTTCAACCCA GTCAGCTCCT TCCGGTGGGC GCGGGGCATG ACTATCGTCG
3601 CCGCACTTAT GACTGTCTTC TTTATCATGC AACTCGTAGG ACAGGTGCCG
3651 GCAGCGCTCT GGGTCATTTT CGGCGAGGAC CGCTTTCGCT GGAGCGCGAC
3701 GATGATCGGC CTGTCGCTTG CGGTATTCGG AATCTTGCAC GCCCTCGCTC
3751 AAGCCTTCGT CACTGGTCCC GCCACCAAAC GTTTCGGCGA GAAGCAGGCC
3801 ATTATCGCCG GCATGGCGGC CGACGCGCTG GGCTACGTCT TGCTGGCGTT
3851 CGCGACGCGA GGCTGGATGG CCTTCCCCAT TATGATTCTT CTCGCTTCCG
3901 GCGGCATCGG GATGCCCGCG TTGCAGGCCA TGCTGTCCAG GCAGGTAGAT
3951 GACGACCATC AGGGACAGCT TCAAGGATCG CTCGCGGCTC TTACCAGCCT
4001 AACTTCGATC ACTGGACCGC TGATCGTCAC GGCGATTTAT GCCGCCTCGG
4051 CGAGCACATG GAACGGGTTG GCATGGATTG TAGGCGCCGC CCTATACCTT
4101 GTCTGCCTCC CCGCGTTGCG TCGCGGTGCA TGGAGCCGGG CCACCTCGAC
4151 CTGAATGGAA GCCGGCGGCA CCTCGCTAAC GGATTCACCA CTCCAAGAAT
4201 TGGAGCCAAT CAATTCTTGC GGAGAACTGT GAATGCGCAA ACCAACCCTT
4251 GGCAGAACAT ATCCATCGCG TCCGCCATCT CCAGCAGCCG CACGCGGCGC
4301 ATCTCGGGCC GCGTTGCTGG CGTTTTTCCA TAGGCTCCGC CCCCCTGACG
4351 AGCATCACAA AAATCGACGC TCAAGTCAGA GGTGGCGAAA CCCGACAGGA
4401 CTATAAAGAT ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC
4451 TGTTCCGACC CTGCCGCTTA CCGGATACCT GTCCGCCTTT CTCCCTTCGG
4501 GAAGCGTGGC GCTTTCTCAA TGCTCACGCT GTAGGTATCT CAGTTCGGTG
4551 TAGGTCGTTC GCTCCAAGCT GGGCTGTGTG CACGAACCCC CCGTTCAGCC
4601 CGACCGCTGC GCCTTATCCG GTAACTATCG TCTTGAGTCC AACCCGGTAA
4651 GACACGACTT ATCGCCACTG GCAGCAGCCA CTGGTAACAG GATTAGCAGA
4701 GCGAGGTATG TAGGCGGTGC TACAGAGTTC TTGAAGTGGT GGCCTAACTA
4751 CGGCTACACT AGAAGGACAG TATTTGGTAT CTGCGCTCTG CTGAAGCCAG
4801 TTACCTTCGG AAAAAGAGTT GGTAGCTCTT GATCCGGCAA ACAAACCACC
4851 GCTGGTAGCG GTGGTTTTTT TGTTTGCAAG CAGCAGATTA CGCGCAGAAA
4901 AAAAGGATCT CAAGAAGATC CTTTGATCTT TTCTACGGGG TCTGACGCTC 
4951 AGTGGAACGA AAACTCACGT TAAGGGATTT TGGTCATGAG ATTATCAAAA 
5001 AGGATCTTCA CCTAGATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT 
5051 CTAAAGTATA TATGAGTAAA CTTGGTCTGA CAGTTACCAA TGCTTAATCA 
5101 GTGAGGCACC TATCTCAGCG ATCTGTCTAT TTCGTTCATC CATAGTTGCC 
5151 TGACTCCCCG TCGTGTAGAT AACTACGATA CGGGAGGGCT TACCATCTGG 
5201 CCCCAGTGCT GCAATGATAC CGCGAGACCC ACGCTCACCG GCTCCAGATT 
5251 TATCAGCAAT AAACCAGCCA GCCGGAAGGG CCGAGCGCAG AAGTGGTCCT 
5301 GCAACTTTAT CCGCCTCCAT CCAGTCTATT AATTGTTGCC GGGAAGCTAG 
5351 AGTAAGTAGT TCGCCAGTTA ATAGTTTGCG CAACGTTGTT GCCATTGCTG 
5401 CAGGCATCGT GGTGTCACGC TCGTCGTTTG GTATGGCTTC ATTCAGCTCC 
5451 GGTTCCCAAC GATCAAGGCG AGTTACATGA TCCCCCATGT TGTGCAAAAA 
5501 AGCGGTTAGC TCCTTCGGTC CTCCGATCGT TGTCAGAAGT AAGTTGGCCG 
5551 CAGTGTTATC ACTCATGGTT ATGGCAGCAC TGCATAATTC TCTTACTGTC 
5601 ATGCCATCCG TAAGATGCTT TTCTGTGACT GGTGAGTACT CAACCAAGTC 
5651 ATTCTGAGAA TAGTGTATGC GGCGACCGAG TTGCTCTTGC CCGGCGTCAA 
5701 CACGGGATAA TACCGCGCCA CATAGCAGAA CTTTAAAAGT GCTCATCATT 
5751 GGAAAACGTT CTTCGGGGCG AAAACTCTCA AGGATCTTAC CGCTGTTGAG 
5801 ATCCAGTTCG ATGTAACCCA CTCGTGCACC CAACTGATCT TCAGCATCTT 
5851 TTACTTTCAC CAGCGTTTCT GGGTGAGCAA AAACAGGAAG GCAAAATGCC
5901 GCAAAAAAGG GAATAAGGGC GACACGGAAA TGTTGAATAC TCATACTCTT 
5951 CCTTTTTCAA TATTATTGAA GCATTTATCA GGGTTATTGT CTCATGAGCG 
6001 GATACATATT TGAATGTATT TAGAAAAATA AACAAATAGG GGTTCCGCGC 
6051 ACATTTCCCC GAAAAGTGCC ACCTGACGTC TAAGAAACCA TTATTATCAT 
6101 GACATTAACC TATAAAAATA GGCGTATCAC GAGGCCCTTT CGTCTTCAA
3 ' cta oct rrr cca gtc a 3 

FIG. 18 

62 

5' CAT CTC ACT CCA AAC 3'

63 

5' CCC CTC ATA CTA A 3'



Si 

5' CAT CTT ACT ATC A 3'



67 

3* cCCCa«CCCTCACCCTCACCTTCCA6WCCCC3' 



SI 

3' CCC CCC CCC TCT CCA ACC TCA CCC TCA CCC TCT C 3* 
5' CCC CCT ACT ACC CCC TCA CTC CAA TCA 3' 

70 

3' CAT CTC ATT CCA CTC ACC CCC TAC TAC 3' 

s * ccr AfitAacccTCACTCCAATCtAccAcrcs' 



22 

3' TAG CAC TOe TAC ATT CCA CTC ACC CCC TAC TAC 3 



^S* CTACC*CTAAAAACATACAGCCCCCCAAGACCTCA3* 



~V CAT CTC ACC TCT TTC CCC CCC TCT ATC TTT TTA CCC 3 ' 

s . CCA CCA TAC TCC CAC CTC CAC ATC CAC TCT CTT CCA 
CAA CTC A 3' 



76 ACA CAC TCC ATC TCC ACC TCC 



3' CAT CTC ACT TCT CCA 
CAC TAT CCT CCA CCT 3'

FIG. 19

AA #3 = LYS

first 113 AA of T4

og394.sec Length: 5365

1 GAATTAATTC CAGCTTGCTG TGGAATGTGT GTCAGTTAGG GTGTGGAAAG
51 TCCCCAGGCT CCCCAGCAGG CAGAAGTATG CAAAGCATGC ATCTCAATTA
101 GTCAGCAACC AGGTGTGGAA AGTCCCCAGG CTCCCCAGCA GGCAGAAGTA
151 TGCAAAGCAT GCATCTCAAT TAGTCAGCAA CCATAGTCCC GCCCCTAACT
201 CCGCCCATCC CGCCCCTAAC TCCGCCCAGT TCCGCCCATT CTCCGCCCCA
251 TGGCTGACTA ATTTTTTTTA TTTATGCAGA GGCCGAGGCC GCCTCGGCCT
301 CTGAGCTATT CCAGAAGTAG TGAGGAGGCT TTTTTGGAGG GGTCCTCCTC
351 GTATAGAAAC TCGGACCACT CTGAGACGAA GGCTCGCGTC CAGGCCAGCA
401 CGAAGGAGGC TAAGTGGGAG GGGTAGCGGT CGTTGTCCAC TAGGGGGTCC
451 ACTCGCTCCA GGGTGTGAAG ACACATGTCG CCCTCTTCGG CATCAAGGAA
501 GGTGATTGGT TTATAGGTGT AGGCCACGTG ACCGGGTGTT CCTGAAGGGG
551 GGCTATAAAA GGGGGTGGGG GCGCGTTCGT CCTCACTCTC TTCCGCATCG
601 CTGTCTGCGA GGGCCAGCTG TTGGGCTCGC GGTTGAGGAC AAACTCTTCG
651 CGGTCTTTCC AGTACTCTTG GATCGGAAAC CCGTCGGCCT CCGAACGGTA
701 CTCCGCCACC GAGGGACCTG AGCGAGTCCG CATCGACCGG ATCGGAAAAC
751 CTCTCGAGAA AGGCGTCTAA CCAGTCACAG TCGCAAGGTA GGCTGAGCAC
801 CGTGGCGGGC GGCAGCGGGT GGCGGTCGGG GTTGTTTCTG GCGGAGGTGC
851 TGCTGATGAT GTAATTAAAG TAGGCGGTCT TGAGACGGCG GATGGTCGAG
901 GTGAGGTGTG GCAGGCTTGA GATCGATCTG GCCATACACT TGAGTGACAA
951 TGACATCCAC TTTGCCTTTC TCTCCACAGG TGTCCACTCC CAGGTCCAAC
1001 TGGATCCAAG CTTCGACTCG AGGAATTCCC CGAAGGAACA AAGCACCCTC
1051 CCCACTGGGC TCCTGGTTGC AGAGCTCCAA GTCCTCACAC AGATACGCCT
1101 GTTTGAGAAG CAGCGGGCAA GAAAGACGCA AGCCCAGAGG CCCTGCCATT
1151 TCTGTGGGCT CAGGTCCCTA CTGGCTCAGG CCCCTGCCTC CCTCGGCAAG
1201 GCCACAATGA ACCGGGGAGT CCCTTTTAGG CACTTGCTTC TGGTGCTGCA
1251 ACTGGCGCTC CTCCCAGCAG CCACTCAGGG AAAGAAAGTG GTGCTGGGCA
1301 AAAAAGGGGA TACAGTGGAA CTGACCTGTA CAGCTTCCCA GAAGAAGAGC
1351 ATACAATTCC ACTGGAAAAA CTCCAACCAG ATAAAGATTC TGGGAAATCA
1401 GGGCTCCTTC TTAACTAAAG GTCCATCCAA GCTGAATGAT CGCGCTGACT
1451 CAAGAAGAAG CTTGTGGGAC CAAGGAAACT TTCCCCTGAT CATCAAGAAT
1501 CTTAAGATAG AAGACTCAGA TACTTACATC TGTGAAGTGG AGGACCAGAA
1551 GGAGGAGGTG CAATTGCTAG TGTTCGGATT GACTGCCAAC TCTGACACCC
1601 ACCTGCTTCA GGGGTGATAG TAAGATCTTT GTGAAGGAAC CTTACTTCTG
1651 TGGTGTGACA TAATTGGACA AACTACCTAC AGAGATTTAA AGCTCTAAGG
1701 TAAATATAAA ATTTTTAAGT GTATAATGTG TTAAACTACT GATTCTAATT
1751 GTTTGTGTAT TTTAGATTCC AACCTATGGA ACTGATGAAT GGGAGCAGTG
1801 GTGGAATGCC TTTAATGAGG AAAACCTGTT TTGCTCAGAA GAAATGCCAT
1851 CTAGTGATGA TGAGGCTACT GCTGACTCTC AACATTCTAC TCCTCCAAAA
1901 AAGAAGAGAA AGGTAGAAGA CCCCAAGGAC TTTCCTTCAG AATTGCTAAG
1951 TTTTTTGAGT CATGCTGTGT TTAGTAATAG AACTCTTGCT TGCTTTGCTA
2001 TTTACACCAC AAAGGAAAAA GCTGCACTGC TATACAAGAA AATTATGGAA
2051 AAATATTCTG TAACCTTTAT AAGTAGGCAT AACAGTTATA ATCATAACAT
2101 ACTGTTTTTT CTTACTCCAC ACAGGCATAG AGTGTCTGCT ATTAATAACT
2151 ATGCTCAAAA ATTGTGTACC TTTAGCTTTT TAATTTGTAA AGGGGTTAAT
2201 AAGGAATATT TGATGTATAG TGCCTTGACT AGAGATCATA ATCAGCCATA
2251 CCACATTTGT AGAGGTTTTA CTTGCTTTAA AAAACCTCCC ACACCTCCCC
2301 CTGAACCTGA AACATAAAAT GAATGCAATT GTTGTTGTTA ACTTGTTTAT
2351 TGCAGCTTAT AATGGTTACA AATAAAGCAA TAGCATCACA AATTTCACAA
2401 ATAAAGCATT TTTTTCACTG CATTCTAGTT GTGGTTTGTC CAAACTCATC
2451 AATGTATCTT ATCATGTCTG GATCCTCTAC GCCGGACGCA TCGTGGCCGG
2501 CATCACCGGC GCCACAGGTG CGGTTGCTGG CGCCTATATC GCCGACATCA
2551 CCGATGGGGA AGATCGGGCT CGCCACTTCG GGCTCATGAG CGCTTGTTTC
2601 GGCGTGGGTA TGGTGGCAGG CCCGTGGCCG GGGGACTGTT GGGCGCCATC
2651 TCCTTGCATG CACCATTCCT TGCGGCGGCG GTGCTCAACG GCCTCAACCT
2701 ACTACTGGGC TGCTTCCTAA TGCAGGAGTC GCATAAGGGA GAGCGTCGAC
2751 CGATGCCCTT GAGAGCCTTC AACCCAGTCA GCTCCTTCCG GTGGGCGCGG
2801 GGCATGACTA TCGTCGCCGC ACTTATGACT GTCTTCTTTA TCATGCAACT
2851 CGTAGGACAG GTGCCGGCAG CGCTCTGGGT CATTTTCGGC GAGGACCGCT
2901 TTCGCTGGAG CGCGACGATG ATCGGCCTGT CGCTTGCGGT ATTCGGAATC
2951 TTGCACGCCC TCGCTCAAGC CTTCGTCACT GGTCCCGCCA CCAAACGTTT
3001 CGGCGAGAAG CAGGCCATTA TCGCCGGCAT GGCGGCCGAC GCGCTGGGCT
3051 ACGTCTTGCT GGCGTTCGCG ACGCGAGGCT GGATGGCCTT CCCCATTATG
3101 ATTCTTCTCG CTTCCGGCGG CATCGGGATG CCCGCGTTGC AGGCCATGCT
3151 GTCCAGGCAG GTAGATGACG ACCATCAGGG ACAGCTTCAA GGATCGCTCG
3201 CGGCTCTTAC CAGCCTAACT TCGATCACTG GACCGCTGAT CGTCACGGCG
3251 ATTTATGCCG CCTCGGCGAG CACATGGAAC GGGTTGGCAT GGATTGTAGG
3301 CGCCGCCCTA TACCTTGTCT GCCTCCCCGC GTTGCGTCGC GGTGCATGGA
3351 GCCGGGCCAC CTCGACCTGA ATGGAAGCCG GCGGCACCTC GCTAACGGAT
3401 TCACCACTCC AAGAATTGGA GCCAATCAAT TCTTGCGGAG AACTGTGAAT
3451 GCGCAAACCA ACCCTTGGCA GAACATATCC ATCGCGTCCG CCATCTCCAG
3501 CAGCCGCACG CGGCGCATCT CGGGCCGCGT TGCTGGCGTT TTTCCATAGG
3551 CTCCGCCCCC CTGACGAGCA TCACAAAAAT CGACGCTCAA GTCAGAGGTG
3601 GCGAAACCCG ACAGGACTAT AAAGATACCA GGCGTTTCCC CCTGGAAGCT
3651 CCCTCGTGCG CTCTCCTGTT CCGACCCTGC CGCTTACCGG ATACCTGTCC
3701 GCCTTTCTCC CTTCGGGAAG CGTGGCGCTT TCTCAATGCT CACGCTGTAG
3751 GTATCTCAGT TCGGTGTAGG TCGTTCGCTC CAAGCTGGGC TGTGTGCACG
3801 AACCCCCCGT TCAGCCCGAC CGCTGCGCCT TATCCGGTAA CTATCGTCTT
3851 GAGTCCAACC CGGTAAGACA CGACTTATCG CCACTGGCAG CAGCCACTGG
3901 TAACAGGATT AGCAGAGCGA GGTATGTAGG CGGTGCTACA GAGTTCTTGA
3951 AGTGGTGGCC TAACTACGGC TACACTAGAA GGACAGTATT TGGTATCTGC
4001 GCTCTGCTGA AGCCAGTTAC CTTCGGAAAA AGAGTTGGTA GCTCTTGATC
4051 CGGCAAACAA ACCACCGCTG GTAGCGGTGG TTTTTTTGTT TGCAAGCAGC
4101 AGATTACGCG CAGAAAAAAA GGATCTCAAG AAGATCCTTT GATCTTTTCT
4151 ACGGGGTCTG ACGCTCAGTG GAACGAAAAC TCACGTTAAG GGATTTTGGT
4201 CATGAGATTA TCAAAAAGGA TCTTCACCTA GATCCTTTTA AATTAAAAAT
4251 GAAGTTTTAA ATCAATCTAA AGTATATATG AGTAAACTTG GTCTGACAGT
4301 TACCAATGCT TAATCAGTGA GGCACCTATC TCAGCGATCT GTCTATTTCG
4351 TTCATCCATA GTTGCCTGAC TCCCCGTCGT GTAGATAACT ACGATACGGG
4401 AGGGCTTACC ATCTGGCCCC AGTGCTGCAA TGATACCGCG AGACCCACGC
4451 TCACCGGCTC CAGATTTATC AGCAATAAAC CAGCCAGCCG GAAGGGCCGA
4501 GCGCAGAAGT GGTCCTGCAA CTTTATCCGC CTCCATCCAG TCTATTAATT
4551 GTTGCCGGGA AGCTAGAGTA AGTAGTTCGC CAGTTAATAG TTTGCGCAAC
4601 GTTGTTGCCA TTGCTGCAGG CATCGTGGTG TCACGCTCGT CGTTTGGTAT
4651 GGCTTCATTC AGCTCCGGTT CCCAACGATC AAGGCGAGTT ACATGATCCC

FIG. 19 (cont'd)

4701 CCATGTTGTG CAAAAAAGCG GTTAGCTCCT TCGGTCCTCC GATCGTTGTC
4751 AGAAGTAAGT TGGCCGCAGT GTTATCACTC ATGGTTATGG CAGCACTGCA 
4801 TAATTCTCTT ACTGTCATGC CATCCGTAAG ATGCTTTTCT GTGACTGGTG 
4851 AGTACTCAAC CAAGTCATTC TGAGAATAGT GTATGCGGCG ACCGAGTTGC 
4901 TCTTGCCCGG CGTCAACACG GGATAATACC GCGCCACATA GCAGAACTTT 
4951 AAAAGTGCTC ATCATTGGAA AACGTTCTTC GGGGCGAAAA CTCTCAAGGA
5001 TCTTACCGCT GTTGAGATCC AGTTCGATGT AACCCACTCG TGCACCCAAC 
5051 TGATCTTCAG CATCTTTTAC TTTCACCAGC GTTTCTGGGT GAGCAAAAAC 
5101 AGGAAGGCAA AATGCCGCAA AAAAGGGAAT AAGGGCGACA CGGAAATGTT 
5151 GAATACTCAT ACTCTTCCTT TTTCAATATT ATTGAAGCAT TTATCAGGGT
5201 TATTGTCTCA TGAGCGGATA CATATTTGAA TGTATTTAGA AAAATAAACA 
5251 AATAGGGGTT CCGCGCACAT TTCCCCGAAA AGTGCCACCT GACGTCTAAG 
5301 AAACCATTAT TATCATGACA TTAACCTATA AAAATAGGCG TATCACGAGG 
5351 CCCTTTCGTC TTCAA

FIG. 20

1 GAATTAATTC CAGCTTGCTG TGGAATGTGT GTCAGTTAGG GTGTGGAAAG
51 TCCCCAGGCT CCCCAGCAGG CAGAAGTATG CAAAGCATGC ATCTCAATTA
101 GTCAGCAACC AGGTGTGGAA AGTCCCCAGG CTCCCCAGCA GGCAGAAGTA
151 TGCAAAGCAT GCATCTCAAT TAGTCAGCAA CCATAGTCCC GCCCCTAACT
201 CCGCCCATCC CGCCCCTAAC TCCGCCCAGT TCCGCCCATT CTCCGCCCCA
251 TGGCTGACTA ATTTTTTTTA TTTATGCAGA GGCCGAGGCC GCCTCGGCCT
301 CTGAGCTATT CCAGAAGTAG TGAGGAGGCT TTTTTGGAGG GGTCCTCCTC
351 GTATAGAAAC TCGGACCACT CTGAGACGAA GGCTCGCGTC CAGGCCAGCA
401 CGAAGGAGGC TAAGTGGGAG GGGTAGCGGT CGTTGTCCAC TAGGGGGTCC
451 ACTCGCTCCA GGGTGTGAAG ACACATGTCG CCCTCTTCGG CATCAAGGAA
501 GGTGATTGGT TTATAGGTGT AGGCCACGTG ACCGGGTGTT CCTGAAGGGG
551 GGCTATAAAA GGGGGTGGGG GCGCGTTCGT CCTCACTCTC TTCCGCATCG
601 CTGTCTGCGA GGGCCAGCTG TTGGGCTCGC GGTTGAGGAC AAACTCTTCG
651 CGGTCTTTCC AGTACTCTTG GATCGGAAAC CCGTCGGCCT CCGAACGGTA
701 CTCCGCCACC GAGGGACCTG AGCGAGTCCG CATCGACCGG ATCGGAAAAC
751 CTCTCGAGAA AGGCGTCTAA CCAGTCACAG TCGCAAGGTA GGCTGAGCAC
801 CGTGGCGGGC GGCAGCGGGT GGCGGTCGGG GTTGTTTCTG GCGGAGGTGC
851 TGCTGATGAT GTAATTAAAG TAGGCGGTCT TGAGACGGCG GATGGTCGAG
901 GTGAGGTGTG GCAGGCTTGA GATCGATCTG GCCATACACT TGAGTGACAA
951 TGACATCCAC TTTGCCTTTC TCTCCACAGG TGTCCACTCC CAGGTCCAAC
1001 TGGATCCAAG CTTCGACTCG AGGAATTCCC CGAAGGAACA AAGCACCCTC
1051 CCCACTGGGC TCCTGGTTGC AGAGCTCCAA GTCCTCACAC AGATACGCCT
1101 GTTTGAGAAG CAGCGGGCAA GAAAGACGCA AGCCCAGAGG CCCTGCCATT
1151 TCTGTGGGCT CAGGTCCCTA CTGGCTCAGG CCCCTGCCTC CCTCGGCAAG
1201 GCCACAATGA ACCGGGGAGT CCCTTTTAGG CACTTGCTTC TGGTGCTGCA
1251 ACTGGCGCTC CTCCCAGCAG CCACTCAGGG AAAGAAAGTG GTGCTGGGCA
1301 AAAAAGGGGA TACAGTGGAA CTGACCTGTA CAGCTTCCCA GAAGAAGAGC
1351 ATACAATTCC ACTGGAAAAA CTCCAACCAG ATAAAGATTC TGGGAAATCA
1401 GGGCTCCTTC TTAACTAAAG GTCCATCCAA GCTGAATGAT CGCGCTGACT

FIG. 20 (cont'd)

1451 CAAGAAGAAG CTTGTGGGAC CAAGGAAACT TTCCCCTGAT CATCAAGAAT
1501 CTTAAGATAG AAGACTCAGA TACTTACATC TGTGAAGTGG AGGACCAGAA
1551 GGAGGAGGTG CAATTGCTAG TGTTCGGATT GACTGCCAAC TCTGACACCC
1601 ACCTGCTTCA GGGGCAGAGC CTGACCCTGA CCTTGGAGAG CCCCCCTGGT
1651 AGTAGCCCCT CAGTGCAATG TAGGAGTCCA AGGGGTAAAA ACATACAGGG
1701 GGGGAAGACC CTCTCCGTGT CTCAGCTGGA GCTCCAGGAT AGTGGCACCT
1751 GGACATGCAC TGTCTTGCAG AACTGAGATC TTTGTGAAGG AACCTTACTT
1801 CTGTGGTGTG ACATAATTGG ACAAACTACC TACAGAGATT TAAAGCTCTA
1851 AGGTAAATAT AAAATTTTTA AGTGTATAAT GTGTTAAACT ACTGATTCTA
1901 ATTGTTTGTG TATTTTAGAT TCCAACCTAT GGAACTGATG AATGGGAGCA
1951 GTGGTGGAAT GCCTTTAATG AGGAAAACCT GTTTTGCTCA GAAGAAATGC
2001 CATCTAGTGA TGATGAGGCT ACTGCTGACT CTCAACATTC TACTCCTCCA
2051 AAAAAGAAGA GAAAGGTAGA AGACCCCAAG GACTTTCCTT CAGAATTGCT
2101 AAGTTTTTTG AGTCATGCTG TGTTTAGTAA TAGAACTCTT GCTTGCTTTG
2151 CTATTTACAC CACAAAGGAA AAAGCTGCAC TGCTATACAA GAAAATTATG
2201 GAAAAATATT CTGTAACCTT TATAAGTAGG CATAACAGTT ATAATCATAA
2251 CATACTGTTT TTTCTTACTC CACACAGGCA TAGAGTGTCT GCTATTAATA
2301 ACTATGCTCA AAAATTGTGT ACCTTTAGCT TTTTAATTTG TAAAGGGGTT
2351 AATAAGGAAT ATTTGATGTA TAGTGCCTTG ACTAGAGATC ATAATCAGCC
2401 ATACCACATT TGTAGAGGTT TTACTTGCTT TAAAAAACCT CCCACACCTC
2451 CCCCTGAACC TGAAACATAA AATGAATGCA ATTGTTGTTG TTAACTTGTT
2501 TATTGCAGCT TATAATGGTT ACAAATAAAG CAATAGCATC ACAAATTTCA
2551 CAAATAAAGC ATTTTTTTCA CTGCATTCTA GTTGTGGTTT GTCCAAACTC
2601 ATCAATGTAT CTTATCATGT CTGGATCCTC TACGCCGGAC GCATCGTGGC
2651 CGGCATCACC GGCGCCACAG GTGCGGTTGC TGGCGCCTAT ATCGCCGACA
2701 TCACCGATGG GGAAGATCGG GCTCGCCACT TCGGGCTCAT GAGCGCTTGT
2751 TTCGGCGTGG GTATGGTGGC AGGCCCGTGG CCGGGGGACT GTTGGGCGCC
2801 ATCTCCTTGC ATGCACCATT CCTTGCGGCG GCGGTGCTCA ACGGCCTCAA
2851 CCTACTACTG GGCTGCTTCC TAATGCAGGA GTCGCATAAG GGAGAGCGTC
2901 GACCGATGCC CTTGAGAGCC TTCAACCCAG TCAGCTCCTT CCGGTGGGCG
2951 CGGGGCATGA CTATCGTCGC CGCACTTATG ACTGTCTTCT TTATCATGCA
3001 ACTCGTAGGA CAGGTGCCGG CAGCGCTCTG GGTCATTTTC GGCGAGGACC
3051 GCTTTCGCTG GAGCGCGACG ATGATCGGCC TGTCGCTTGC GGTATTCGGA
3101 ATCTTGCACG CCCTCGCTCA AGCCTTCGTC ACTGGTCCCG CCACCAAACG
3151 TTTCGGCGAG AAGCAGGCCA TTATCGCCGG CATGGCGGCC GACGCGCTGG
3201 GCTACGTCTT GCTGGCGTTC GCGACGCGAG GCTGGATGGC CTTCCCCATT
3251 ATGATTCTTC TCGCTTCCGG CGGCATCGGG ATGCCCGCGT TGCAGGCCAT
3301 GCTGTCCAGG CAGGTAGATG ACGACCATCA GGGACAGCTT CAAGGATCGC
3351 TCGCGGCTCT TACCAGCCTA ACTTCGATCA CTGGACCGCT GATCGTCACG
3401 GCGATTTATG CCGCCTCGGC GAGCACATGG AACGGGTTGG CATGGATTGT
3451 AGGCGCCGCC CTATACCTTG TCTGCCTCCC CGCGTTGCGT CGCGGTGCAT
3501 GGAGCCGGGC CACCTCGACC TGAATGGAAG CCGGCGGCAC CTCGCTAACG
3551 GATTCACCAC TCCAAGAATT GGAGCCAATC AATTCTTGCG GAGAACTGTG
3601 AATGCGCAAA CCAACCCTTG GCAGAACATA TCCATCGCGT CCGCCATCTC
3651 CAGCAGCCGC ACGCGGCGCA TCTCGGGCCG CGTTGCTGGC GTTTTTCCAT
3701 AGGCTCCGCC CCCCTGACGA GCATCACAAA AATCGACGCT CAAGTCAGAG
3751 GTGGCGAAAC CCGACAGGAC TATAAAGATA CCAGGCGTTT CCCCCTGGAA
3801 GCTCCCTCGT GCGCTCTCCT GTTCCGACCC TGCCGCTTAC CGGATACCTG
3851 TCCGCCTTTC TCCCTTCGGG AAGCGTGGCG CTTTCTCAAT GCTCACGCTG
3901 TAGGTATCTC AGTTCGGTGT AGGTCGTTCG CTCCAAGCTG GGCTGTGTGC
3951 ACGAACCCCC CGTTCAGCCC GACCGCTGCG CCTTATCCGG TAACTATCGT
4001 CTTGAGTCCA ACCCGGTAAG ACACGACTTA TCGCCACTGG CAGCAGCCAC
4051 TGGTAACAGG ATTAGCAGAG CGAGGTATGT AGGCGGTGCT ACAGAGTTCT
4101 TGAAGTGGTG GCCTAACTAC GGCTACACTA GAAGGACAGT ATTTGGTATC
4151 TGCGCTCTGC TGAAGCCAGT TACCTTCGGA AAAAGAGTTG GTAGCTCTTG
4201 ATCCGGCAAA CAAACCACCG CTGGTAGCGG TGGTTTTTTT GTTTGCAAGC
4251 AGCAGATTAC GCGCAGAAAA AAAGGATCTC AAGAAGATCC TTTGATCTTT
4301 TCTACGGGGT CTGACGCTCA GTGGAACGAA AACTCACGTT AAGGGATTTT
4351 GGTCATGAGA TTATCAAAAA GGATCTTCAC CTAGATCCTT TTAAATTAAA
4401 AATGAAGTTT TAAATCAATC TAAAGTATAT ATGAGTAAAC TTGGTCTGAC
4451 AGTTACCAAT GCTTAATCAG TGAGGCACCT ATCTCAGCGA TCTGTCTATT
4501 TCGTTCATCC ATAGTTGCCT GACTCCCCGT CGTGTAGATA ACTACGATAC
4551 GGGAGGGCTT ACCATCTGGC CCCAGTGCTG CAATGATACC GCGAGACCCA
4601 CGCTCACCGG CTCCAGATTT ATCAGCAATA AACCAGCCAG CCGGAAGGGC
4651 CGAGCGCAGA AGTGGTCCTG CAACTTTATC CGCCTCCATC CAGTCTATTA
4701 ATTGTTGCCG GGAAGCTAGA GTAAGTAGTT CGCCAGTTAA TAGTTTGCGC

FIG. 20 (cont'd)

4751 AACGTTGTTG CCATTGCTGC AGGCATCGTG GTGTCACGCT CGTCGTTTGG
4801 TATGGCTTCA TTCAGCTCCG GTTCCCAACG ATCAAGGCGA GTTACATGAT
4851 CCCCCATGTT GTGCAAAAAA GCGGTTAGCT CCTTCGGTCC TCCGATCGTT
4901 GTCAGAAGTA AGTTGGCCGC AGTGTTATCA CTCATGGTTA TGGCAGCACT
4951 GCATAATTCT CTTACTGTCA TGCCATCCGT AAGATGCTTT TCTGTGACTG
5001 GTGAGTACTC AACCAAGTCA TTCTGAGAAT AGTGTATGCG GCGACCGAGT
5051 TGCTCTTGCC CGGCGTCAAC ACGGGATAAT ACCGCGCCAC ATAGCAGAAC
5101 TTTAAAAGTG CTCATCATTG GAAAACGTTC TTCGGGGCGA AAACTCTCAA
5151 GGATCTTACC GCTGTTGAGA TCCAGTTCGA TGTAACCCAC TCGTGCACCC
5201 AACTGATCTT CAGCATCTTT TACTTTCACC AGCGTTTCTG GGTGAGCAAA
5251 AACAGGAAGG CAAAATGCCG CAAAAAAGGG AATAAGGGCG ACACGGAAAT
5301 GTTGAATACT CATACTCTTC CTTTTTCAAT ATTATTGAAG CATTTATCAG
5351 GGTTATTGTC TCATGAGCGG ATACATATTT GAATGTATTT AGAAAAATAA
5401 ACAAATAGGG GTTCCGCGCA CATTTCCCCG AAAAGTGCCA CCTGACGTCT
5451 AAGAAACCAT TATTATCATG ACATTAACCT ATAAAAATAG GCGTATCACG
5501 AGGCCCTTTC GTCTTCAA

:BG366 backbone

:soluble T4 #6
:AA #3 = LYS

:"0« r f«Ct" Stu/first 182 A A of T 
: cas icai < y uO to V2 J2 

og392.sec length: 5566 



FIG. 21

1 GAATTAATTC CAGCTTGCTG TGGAATGTGT GTCAGTTAGG GTGTGGAAAG
51 TCCCCAGGCT CCCCAGCAGG CAGAAGTATG CAAAGCATGC ATCTCAATTA
101 GTCAGCAACC AGGTGTGGAA AGTCCCCAGG CTCCCCAGCA GGCAGAAGTA
151 TGCAAAGCAT GCATCTCAAT TAGTCAGCAA CCATAGTCCC GCCCCTAACT
201 CCGCCCATCC CGCCCCTAAC TCCGCCCAGT TCCGCCCATT CTCCGCCCCA
251 TGGCTGACTA ATTTTTTTTA TTTATGCAGA GGCCGAGGCC GCCTCGGCCT
301 CTGAGCTATT CCAGAAGTAG TGAGGAGGCT TTTTTGGAGG GGTCCTCCTC
351 GTATAGAAAC TCGGACCACT CTGAGACGAA GGCTCGCGTC CAGGCCAGCA
401 CGAAGGAGGC TAAGTGGGAG GGGTAGCGGT CGTTGTCCAC TAGGGGGTCC
451 ACTCGCTCCA GGGTGTGAAG ACACATGTCG CCCTCTTCGG CATCAAGGAA
501 GGTGATTGGT TTATAGGTGT AGGCCACGTG ACCGGGTGTT CCTGAAGGGG
551 GGCTATAAAA GGGGGTGGGG GCGCGTTCGT CCTCACTCTC TTCCGCATCG
601 CTGTCTGCGA GGGCCAGCTG TTGGGCTCGC GGTTGAGGAC AAACTCTTCG
651 CGGTCTTTCC AGTACTCTTG GATCGGAAAC CCGTCGGCCT CCGAACGGTA
701 CTCCGCCACC GAGGGACCTG AGCGAGTCCG CATCGACCGG ATCGGAAAAC
751 CTCTCGAGAA AGGCGTCTAA CCAGTCACAG TCGCAAGGTA GGCTGAGCAC
801 CGTGGCGGGC GGCAGCGGGT GGCGGTCGGG GTTGTTTCTG GCGGAGGTGC
851 TGCTGATGAT GTAATTAAAG TAGGCGGTCT TGAGACGGCG GATGGTCGAG
901 GTGAGGTGTG GCAGGCTTGA GATCGATCTG GCCATACACT TGAGTGACAA
951 TGACATCCAC TTTGCCTTTC TCTCCACAGG TGTCCACTCC CAGGTCCAAC
1001 TGGATCCAAG CTTCGACTCG AGGAATTCCC CGAAGGAACA AAGCACCCTC
1051 CCCACTGGGC TCCTGGTTGC AGAGCTCCAA GTCCTCACAC AGATACGCCT
1101 GTTTGAGAAG CAGCGGGCAA GAAAGACGCA AGCCCAGAGG CCCTGCCATT
1151 TCTGTGGGCT CAGGTCCCTA CTGGCTCAGG CCCCTGCCTC CCTCGGCAAG
1201 GCCACAATGA ACCGGGGAGT CCCTTTTAGG CACTTGCTTC TGGTGCTGCA
1251 ACTGGCGCTC CTCCCAGCAG CCACTCAGGG AAAGAAAGTG GTGCTGGGCA
1301 AAAAAGGGGA TACAGTGGAA CTGACCTGTA CAGCTTCCCA GAAGAAGAGC
1351 ATACAATTCC ACTGGAAAAA CTCCAACCAG ATAAAGATTC TGGGAAATCA

FIG. 21 (cont'd)

1401 GGGCTCCTTC TTAACTAAAG GTCCATCCAA GCTGAATGAT CGCGCTGACT
1451 CAAGAAGAAG CTTGTGGGAC CAAGGAAACT TTCCCCTGAT CATCAAGAAT
1501 CTTAAGATAG AAGACTCAGA TACTTACATC TGTGAAGTGG AGGACCAGAA
1551 GGAGGAGGTG CAATTGCTAG TGTTCGGATT GACTGCCAAC TCTGACACCC
1601 ACCTGCTTCA GGGGCAGAGC CTGACCCTGA CCTTGGAGAG CCCCCCTGGT
1651 AGTAGCCCCT CAGTGCAATG TAGGAGTCCA AGGGGTAAAA ACATACAGGG
1701 GGGGAAGACC CTCTCCGTGT CTCAGCTGGA GCTCCAGGAT AGTGGCACCT
1751 GGACATGCAC TGTCTTGCAG AACCAGAAGA AGGTGGAGTT CAAAATAGAC
1801 ATCGTGGTGC TAGCTTTCCA GTGAGATCTT TGTGAAGGAA CCTTACTTCT
1851 GTGGTGTGAC ATAATTGGAC AAACTACCTA CAGAGATTTA AAGCTCTAAG
1901 GTAAATATAA AATTTTTAAG TGTATAATGT GTTAAACTAC TGATTCTAAT
1951 TGTTTGTGTA TTTTAGATTC CAACCTATGG AACTGATGAA TGGGAGCAGT
2001 GGTGGAATGC CTTTAATGAG GAAAACCTGT TTTGCTCAGA AGAAATGCCA
2051 TCTAGTGATG ATGAGGCTAC TGCTGACTCT CAACATTCTA CTCCTCCAAA
2101 AAAGAAGAGA AAGGTAGAAG ACCCCAAGGA CTTTCCTTCA GAATTGCTAA
2151 GTTTTTTGAG TCATGCTGTG TTTAGTAATA GAACTCTTGC TTGCTTTGCT
2201 ATTTACACCA CAAAGGAAAA AGCTGCACTG CTATACAAGA AAATTATGGA
2251 AAAATATTCT GTAACCTTTA TAAGTAGGCA TAACAGTTAT AATCATAACA
2301 TACTGTTTTT TCTTACTCCA CACAGGCATA GAGTGTCTGC TATTAATAAC
2351 TATGCTCAAA AATTGTGTAC CTTTAGCTTT TTAATTTGTA AAGGGGTTAA
2401 TAAGGAATAT TTGATGTATA GTGCCTTGAC TAGAGATCAT AATCAGCCAT
2451 ACCACATTTG TAGAGGTTTT ACTTGCTTTA AAAAACCTCC CACACCTCCC
2501 CCTGAACCTG AAACATAAAA TGAATGCAAT TGTTGTTGTT AACTTGTTTA
2551 TTGCAGCTTA TAATGGTTAC AAATAAAGCA ATAGCATCAC AAATTTCACA
2601 AATAAAGCAT TTTTTTCACT GCATTCTAGT TGTGGTTTGT CCAAACTCAT
2651 CAATGTATCT TATCATGTCT GGATCCTCTA CGCCGGACGC ATCGTGGCCG
2701 GCATCACCGG CGCCACAGGT GCGGTTGCTG GCGCCTATAT CGCCGACATC
2751 ACCGATGGGG AAGATCGGGC TCGCCACTTC GGGCTCATGA GCGCTTGTTT
2801 CGGCGTGGGT ATGGTGGCAG GCCCGTGGCC GGGGGACTGT TGGGCGCCAT
2851 CTCCTTGCAT GCACCATTCC TTGCGGCGGC GGTGCTCAAC GGCCTCAACC
2901 TACTACTGGG CTGCTTCCTA ATGCAGGAGT CGCATAAGGG AGAGCGTCGA
2951 CCGATGCCCT TGAGAGCCTT CAACCCAGTC AGCTCCTTCC GGTGGGCGCG
3001 GGGCATGACT ATCGTCGCCG CACTTATGAC TGTCTTCTTT ATCATGCAAC

FIG. 21 (cont'd)

3051 TCGTAGGACA GGTGCCGGCA GCGCTCTGGG TCATTTTCGG CGAGGACCGC
3101 TTTCGCTGGA GCGCGACGAT GATCGGCCTG TCGCTTGCGG TATTCGGAAT
3151 CTTGCACGCC CTCGCTCAAG CCTTCGTCAC TGGTCCCGCC ACCAAACGTT
3201 TCGGCGAGAA GCAGGCCATT ATCGCCGGCA TGGCGGCCGA CGCGCTGGGC
3251 TACGTCTTGC TGGCGTTCGC GACGCGAGGC TGGATGGCCT TCCCCATTAT
3301 GATTCTTCTC GCTTCCGGCG GCATCGGGAT GCCCGCGTTG CAGGCCATGC
3351 TGTCCAGGCA GGTAGATGAC GACCATCAGG GACAGCTTCA AGGATCGCTC
3401 GCGGCTCTTA CCAGCCTAAC TTCGATCACT GGACCGCTGA TCGTCACGGC
3451 GATTTATGCC GCCTCGGCGA GCACATGGAA CGGGTTGGCA TGGATTGTAG
3501 GCGCCGCCCT ATACCTTGTC TGCCTCCCCG CGTTGCGTCG CGGTGCATGG
3551 AGCCGGGCCA CCTCGACCTG AATGGAAGCC GGCGGCACCT CGCTAACGGA
3601 TTCACCACTC CAAGAATTGG AGCCAATCAA TTCTTGCGGA GAACTGTGAA
3651 TGCGCAAACC AACCCTTGGC AGAACATATC CATCGCGTCC GCCATCTCCA
3701 GCAGCCGCAC GCGGCGCATC TCGGGCCGCG TTGCTGGCGT TTTTCCATAG
3751 GCTCCGCCCC CCTGACGAGC ATCACAAAAA TCGACGCTCA AGTCAGAGGT
3801 GGCGAAACCC GACAGGACTA TAAAGATACC AGGCGTTTCC CCCTGGAAGC
3851 TCCCTCGTGC GCTCTCCTGT TCCGACCCTG CCGCTTACCG GATACCTGTC
3901 CGCCTTTCTC CCTTCGGGAA GCGTGGCGCT TTCTCAATGC TCACGCTGTA
3951 GGTATCTCAG TTCGGTGTAG GTCGTTCGCT CCAAGCTGGG CTGTGTGCAC
4001 GAACCCCCCG TTCAGCCCGA CCGCTGCGCC TTATCCGGTA ACTATCGTCT
4051 TGAGTCCAAC CCGGTAAGAC ACGACTTATC GCCACTGGCA GCAGCCACTG
4101 GTAACAGGAT TAGCAGAGCG AGGTATGTAG GCGGTGCTAC AGAGTTCTTG
4151 AAGTGGTGGC CTAACTACGG CTACACTAGA AGGACAGTAT TTGGTATCTG
4201 CGCTCTGCTG AAGCCAGTTA CCTTCGGAAA AAGAGTTGGT AGCTCTTGAT
4251 CCGGCAAACA AACCACCGCT GGTAGCGGTG GTTTTTTTGT TTGCAAGCAG
4301 CAGATTACGC GCAGAAAAAA AGGATCTCAA GAAGATCCTT TGATCTTTTC
4351 TACGGGGTCT GACGCTCAGT GGAACGAAAA CTCACGTTAA GGGATTTTGG
4401 TCATGAGATT ATCAAAAAGG ATCTTCACCT AGATCCTTTT AAATTAAAAA
4451 TGAAGTTTTA AATCAATCTA AAGTATATAT GAGTAAACTT GGTCTGACAG
4501 TTACCAATGC TTAATCAGTG AGGCACCTAT CTCAGCGATC TGTCTATTTC
4551 GTTCATCCAT AGTTGCCTGA CTCCCCGTCG TGTAGATAAC TACGATACGG
4601 GAGGGCTTAC CATCTGGCCC CAGTGCTGCA ATGATACCGC GAGACCCACG
4651 CTCACCGGCT CCAGATTTAT CAGCAATAAA CCAGCCAGCC GGAAGGGCCG






FIG. 21 (cont'd) 



4701 AGCGCAGAAG TGGTCCTGCA ACTTTATCCG CCTCCATCCA GTCTATTAAT 
4751 TGTTGCCGGG AAGCTAGAGT AAGTAGTTCG CCAGTTAATA GTTTGCGCAA 
4801 CGTTGTTGCC ATTGCTGCAG GCATCGTGGT GTCACGCTCG TCGTTTGGTA 
4851 TGGCTTCATT CAGCTCCGGT TCCCAACGAT CAAGGCGAGT TACATGATCC 
4901 CCCATGTTGT GCAAAAAAGC GGTTAGCTCC TTCGGTCCTC CGATCGTTGT
4951 CAGAAGTAAG TTGGCCGCAG TGTTATCACT CATGGTTATG GCAGCACTGC 
5001 ATAATTCTCT TACTGTCATG CCATCCGTAA GATGCTTTTC TGTGACTGGT 
5051 GAGTACTCAA CCAAGTCATT CTGAGAATAG TGTATGCGGC GACCGAGTTG 
5101 CTCTTGCCCG GCGTCAACAC GGGATAATAC CGCGCCACAT AGCAGAACTT 
5151 TAAAAGTGCT CATCATTGGA AAACGTTCTT CGGGGCGAAA ACTCTCAAGG
5201 ATCTTACCGC TGTTGAGATC CAGTTCGATG TAACCCACTC GTGCACCCAA 
5251 CTGATCTTCA GCATCTTTTA CTTTCACCAG CGTTTCTGGG TGAGCAAAAA 
5301 CAGGAAGGCA AAATGCCGCA AAAAAGGGAA TAAGGGCGAC ACGGAAATGT 
5351 TGAATACTCA TACTCTTCCT TTTTCAATAT TATTGAAGCA TTTATCAGGG 
5401 TTATTGTCTC ATGAGCGGAT ACATATTTGA ATGTATTTAG AAAAATAAAC
5451 AAATAGGGGT TCCGCGCACA TTTCCCCGAA AAGTGCCACC TGACGTCTAA
5501 GAAACCATTA TTATCATGAC ATTAACCTAT AAAAATAGGC GTATCACGAG
5551 GCCCTTTCGT CTTCAA 
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: 




1 CAGCTTGCTG TGGAATGTGT GTCAGTTAGG GTGTGGAAAG TCCCCAGGCT
51 NNNNNNNNNN CAGAAGTATG CAAAGCATGC ATCTCAATTA GTCAGCAACC
101 AGGTGTGGAA AGTCCCCAGG CTCCCCAGCA GGCAGAAGTA TGCAAAGCAT
151 GCATCTCAAT TAGTCAGCAA CCATAGTCCC GCCCCTAACT CCGCCCATCC
201 CGCCCCTAAC TCCGCCCAGT TCCGCCCATT CTCCGCCCCA TGGCTGACTA
251 ATTTTTTTTA TTTATGCAGA GGCCGAGGCC GCCTCGGCCT CTGAGCTATT
301 CCAGAAGTAG TGAGGAGGCT TTTTTGGAGG GGTCCTCCTC GTATAGAAAC
351 TCGGACCACT CTGAGACGAA GGCTCGCGTC CAGGCCAGCA CGAAGGAGGC
401 TAAGTGGGAG GGGTAGCGGT CGTTGTCCAC TAGGGGGTCC ACTCGCTCCA
451 GGGTGTGAAG ACACATGTCG CCCTCTTCGG CATCAAGGAA GGTGATTGGT
501 TTATAGGTGT AGGCCACGTG ACCGGGTGTT CCTGAAGGGG GGCTATAAAA
551 GGGGGTGGGG GCGCGTTCGT CCTCACTCTC TTCCGCATCG CTGTCTGCGA
601 GGGCCAGCTG TTGGGCTCGC GGTTGAGGAC AAACTCTTCG CGGTCTTTCC
651 AGTACTCTTG GATCGGAAAC CCGTCGGCCT CCGAACGGTA CTCCGCCACC
701 GAGGGACCTG AGCGAGTCCG CATCGACCGG ATCGGAAAAC CTCTCGAGAA
751 AGGCGTCTAA CCAGTCACAG TCGCAAGGTA GGCTGAGCAC CGTGGCGGGC
801 GGCAGCGGGT GGCGGTCGGG GTTGTTTCTG GCGGAGGTGC TGCTGATGAT
851 GTAATTAAAG TAGGCGGTCT TGAGACGGCG GATGGTCGAG GTGAGGTGTG
901 GCAGGCTTGA GATCGATCTG GCCATACACT TGAGTGACAA TGACATCCAC
951 TTTGCCTTTC TCTCCACAGG TGTCCACTCC CAGGTCCAAC TGCATCCAAG
1001 CTTCGACTCG AGGAATTCCC CGAAGGAACA AAGCACCCTC CCCACTGGGC
1051 TCCTGGTTGC AGAGCTCCAA GTCCTCACAC AGATACGCCT GTTTGAGAAG
1101 CAGCGGGCAA GAAAGACGCA AGCCCAGAGG CCCTGCCATT TCTGTGGGCT
1151 CAGGTCCCTA CTGGCTCAGG CCCCTGCCTC CCTCGGCAAG GCCACAATGA
1201 ACCGGGGAGT CCCTTTTAGG CACTTGCTTC TGGTGCTGCA ACTGGCGCTC
1251 CTCCCAGCAG CCACTCAGGG AAAGAAAGTG GTGCTGGGCA AAAAAGGGGA
1301 TACAGTGGAA CTGACCTGTA CAGCTTCCCA GAAGAAGAGC ATACAATTCC
1351 ACTGGAAAAA CTCCAACCAG ATAAAGATTC TGGGAAATCA GGGCTCCTTC
1401 TTAACTAAAG GTCCATCCAA GCTGAATGAT CGCGCTGACT CAAGAAGAAG
1451 CTTGTGGGAC CAAGGAAACT TTCCCCTGAT CATCAAGAAT CTTAAGATAG
1501 AAGACTCAGA TACTTACATC TGTGAAGTGG AGGACCAGAA GGAGGAGGTG
1551 CAATTGCTAG TGTTCGGATT GACTGCCAAC TCTGACACCC ACCTGCTTCA
1601 GGGGCAGAGC CTGACCCTGA CCTTGGAGAG CCCCCCGGGT AGTAGCCCCT
1651 CAGTGCAATG AGATCTTTGT GAAGGAACCT TACTTCTGTG GTGTGACATA
1701 ATTGGACAAA CTACCTACAG AGATTTAAAG CTCTAAGGTA AATATAAAAT
1751 TTTTAAGTGT ATAATGTGTT AAACTACTGA TTCTAATTGT TTGTGTATTT
1801 TAGATTCCAA CCTATGGAAC TGATGAATGG GAGCAGTGGT GGAATGCCTT
1851 TAATGAGGAA AACCTGTTTT GCTCAGAAGA AATGCCATCT AGTGATGATG
1901 AGGCTACTGC TGACTCTCAA CATTCTACTC CTCCAAAAAA GAAGAGAAAG
1951 GTAGAAGACC CCAAGGACTT TCCTTCAGAA TTGCTAAGTT TTTTGAGTCA
2001 TGCTGTGTTT AGTAATAGAA CTCTTGCTTG CTTTGCTATT TACACCACAA
2051 AGGAAAAAGC TGCACTGCTA TACAAGAAAA TTATGGAAAA ATATTCTGTA
2101 ACCTTTATAA GTAGGCATAA CAGTTATAAT CATAACATAC TGTTTTTTCT
2151 TACTCCACAC AGGCATAGAG TGTCTGCTAT TAATAACTAT GCTCAAAAAT
2201 TGTGTACCTT TAGCTTTTTA ATTTGTAAAG GGGTTAATAA GGAATATTTG
2251 ATGTATAGTG CCTTGACTAG AGATCATAAT CAGCCATACC ACATTTGTAG
2301 AGGTTTTACT TGCTTTAAAA AACCTCCCAC ACCTCCCCCT GAACCTGAAA
2351 CATAAAATGA ATGCAATTGT TGTTGTTAAC TTGTTTATTG CAGCTTATAA
2401 TGGTTACAAA TAAAGCAATA GCATCACAAA TTTCACAAAT AAAGCATTTT
2451 TTTCACTGCA TTCTAGTTGT GGTTTGTCCA AACTCATCAA TGTATCTTAT
2501 CATGTCTGGA TCCTCTACGC CGGACGCATC GTGGCCGGCA TCACCGGCGC
2551 CACAGGTGCG GTTGCTGGCG CCTATATCGC CGACATCACC GATGGGGAAG
2601 ATCGGGCTCG CCACTTCGGG CTCATGAGCG CTTGTTTCGG CGTGGGTATG
2651 GTGGCAGGCC CGTGGCCGGG GGACTGTTGG GCGCCATCTC CTTGCATGCA
2701 CCATTCCTTG CGGCGGCGGT GCTCAACGGC CTCAACCTAC TACTGGGCTG
2751 CTTCCTAATG CAGGAGTCGC ATAAGGGAGA GCGTCGACCG ATGCCCTTGA
2801 GAGCCTTCAA CCCAGTCAGC TCCTTCCGGT GGGCGCGGGG CATGACTATC
2851 GTCGCCGCAC TTATGACTGT CTTCTTTATC ATGCAACTCG TAGGACAGGT
2901 GCCGGCAGCG CTCTGGGTCA TTTTCGGCGA GGACCGCTTT CGCTGGAGCG
2951 CGACGATGAT CGGCCTGTCG CTTGCGGTAT TCGGAATCTT CCACGCCCTC
3001 GCTCAAGCCT TCGTCACTGG TCCCGCCACC AAACGTTTCG GCGAGAAGCA
3051 GCCCATTATC GCCGGCATGG CGGCCGACGC GCTGGGCTAC
FIG. 22 (cont'd)

3101 GTCTTGCTGG CGTTCGCGAC GCGAGGCTGG ATGGCCTTCC CCATTATGAT
3151 TCTTCTCGCT TCCGGCGGCA TCGGGATGCC CGCGTTGCAG GCCATGCTGT
3201 CCAGGCAGGT AGATGACGAC CATCAGGGAC AGCTTCAAGG ATCGCTCGCG
3251 GCTCTTACCA GCCTAACTTC GATCACTGGA CCGCTGATCG TCACGGCGAT
3301 TTATGCCGCC TCGGCGAGCA CATGGAACGG GTTGGCATGG ATTGTAGGCG
3351 CCGCCCTATA CCTTGTCTGC CTCCCCGCGT TGCGTCGCGG TGCATGGAGC
3401 CGGGCCACCT CGACCTGAAT GGAAGCCGGC GGCACCTCGC TAACGGATTC
3451 ACCACTCCAA GAATTGGAGC CAATCAATTC TTGCGGAGAA CTGTGAATGC
3501 GCAAACCAAC CCTTGGCAGA ACATATCCAT CGCGTCCGCC ATCTCCAGCA
3551 GCCGCACGCG GCGCATCTCG GGCCGCGTTG CTGGCGTTTT TCCATAGGCT
3601 CCGCCCCCCT GACGAGCATC ACAAAAATCG ACGCTCAAGT CAGAGGTGGC
3651 GAAACCCGAC AGGACTATAA AGATACCAGG CGTTTCCCCC TGGAAGCTCC
3701 CTCGTGCGCT CTCCTGTTCC GACCCTGCCG CTTACCGGAT ACCTGTCCGC
3751 CTTTCTCCCT TCGGGAAGCG TGGCGCTTTC TCAATGCTCA CGCTGTAGGT
3801 ATCTCAGTTC GGTGTAGGTC GTTCGCTCCA AGCTGGGCTG TGTGCACGAA
3851 CCCCCCGTTC AGCCCGACCG CTGCGCCTTA TCCGGTAACT ATCGTCTTGA
3901 GTCCAACCCG GTAAGACACG ACTTATCGCC ACTGGCAGCA GCCACTGGTA
3951 ACAGGATTAG CAGAGCGAGG TATGTAGGCG GTGCTACAGA GTTCTTGAAG
4001 TGGTGGCCTA ACTACGGCTA CACTAGAAGG ACAGTATTTG GTATCTGCGC
4051 TCTGCTGAAG CCAGTTACCT TCGGAAAAAG AGTTGGTAGC TCTTGATCCG
4101 GCAAACAAAC CACCGCTGGT AGCGGTGGTT TTTTTGTTTG CAAGCAGCAG
4151 ATTACGCGCA GAAAAAAAGG ATCTCAAGAA GATCCTTTGA TCTTTTCTAC
4201 GGGGTCTGAC GCTCAGTGGA ACGAAAACTC ACGTTAAGGG ATTTTGGTCA
4251 TGAGATTATC AAAAAGGATC TTCACCTAGA TCCTTTTAAA TTAAAAATGA
4301 AGTTTTAAAT CAATCTAAAG TATATATGAG TAAACTTGGT CTGACAGTTA
4351 CCTTTGCTTA ATCAGTGAGG CACCTATCTC AGCGATCTGT CTATTTCGTT
4401 CATCCATAGT TGCCTGACTC CCCGTCGTGT AGATAACTAC GATACGGGAG
4451 GGCTTACCAT CTGGCCCCAG TGCTGCAATG ATACCGCGAG ACCCACGCTC
4501 ACCGGCTCCA GATTTATCAG CAATAAACCA GCCAGCCGGA AGGGCCGAGC
4551 GCAGAAGTGG TCCTGCAACT TTATCCGCCT CCATCCAGTC TATTAATTGT
4601 TGCCGGGAAG CTAGAGTAAG TAGTTCGCCA GTTAATAGTT TGCGCAACGT
4651 TGTTGCCATT GCTGCAGGCA TCGTGGTGTC ACGCTCGTCG TTTGGTATGG
4701 CTTCATTCAG CTCCGGTTCC CAACGATCAA GGCGAGTTAC ATGATCCCCC
FIG. 22 (cont'd)

4751 ATGTTGTGCA AAAAAGCGGT TAGCTCCTTC GGTCCTCCGA TCGTTGTCAG
4801 AAGTAAGTTG GCCGCAGTGT TATCACTCAT GGTTATGGCA GCACTGCATA
4851 ATTCTCTTAC TGTCATGCCA TCCGTAAGAT GCTTTTCTGT GACTGGTGAG
4901 TACTCAACCA AGTCATTCTG AGAATAGTGT ATGCGGCGAC CGAGTTGCTC
4951 TTGCCCGGCG TCAACACGGG ATAATACCGC GCCACATAGC AGAACTTTAA
5001 AAGTGCTCAT CATTGGAAAA CGTTCTTCGG GGCGAAAACT CTCAAGGATC
5051 TTACCGCTGT TGAGATCCAG TTCGATGTAA CCCACTCGTG CACCCAACTG
5101 ATCTTCAGCA TCTTTTACTT TCACCAGCGT TTCTGGGTGA GCAAAAACAG
5151 GAAGGCAAAA TGCCGCAAAA AAGGGAATAA GGGCGACACG GAAATGTTGA
5201 ATACTCATAC TCTTCCTTTT TCAATATTAT TGAAGCATTT ATCAGGGTTA
5251 TTGTCTCATG AGCGGATACA TATTTGAATG TATTTAGAAA AATAAACAAA
5301 TAGGGGTTCC GCGCACATTT CCCCGAAAAG TGCCACCTGA CGTCTAAGAA
5351 ACCATTATTA TCATGACATT AACCTATAAA AATAGGCGTA TCACGAGGCC
5401 CTTTCGTCTT CAA
FIG. 25
FIG. 29 B